Research Data Management Course

29 - 31 January 2008
IT Learning Center


RDM Course Participants (from left to right): Oliver Castillo, Teodoro Correa Jr., Jack Deodato Jacob, Christine Kreye, Mervin Manalili, Lilia Molina, Rubenito Lampayan, Ambrocio Castañeda, Marnol Santos, Emmali Manalo, Thomas Metz, Violeta Bartolome, and Lizzida Llorca, together with Marie Joy Sy.


COURSE

The first offering of the Research Data Management Course took place on 29-31 January 2008 at the IT Learning Center, with 10 IRRI staff as participants.

This problem-oriented course covered file management at the research group level, disciplined use of spreadsheets for data capture, minimal use of databases for managing data, and various tools for visualizing and analyzing research data. The emphasis was on the database part, with the aim of drastically reducing manual (cut-and-paste) data transformation in spreadsheets.

Participants learned how to improve the efficiency and effectiveness of the data flow from data capture to analysis.

The course ran over three half-days and was conducted by Dr. Thomas Metz, Ms. Emmali Manalo, and Ms. Beng Bartolome (CRIL), with assistance from the IT Learning Center and IT Services.


TRAINING PROGRAM

Course Schedule (pdf)

DAY 1 – File Management and Disciplined Use of Excel

I. Course Overview (pdf) – T.Metz 8:00 – 9:00

II. Setting up your workstation - E.Manalo 9:00 – 9:30

A. Mapping a shared network drive
B. Date format (operating system, Excel, Access)
C. Shortcut to the R statistics software

III. Managing Files in the Organizational Unit Repositories 9:30 – 10:15

A. Directory hierarchy and permissions in OU file repository - T.Metz
B. Directory comparison and synchronization using BeyondCompare - E.Manalo
C. Disk space management using WinDirStat - E.Manalo
D. Compressing a directory tree for backup - E.Manalo

BREAK 10:15 – 10:30

IV. Disciplined use of Excel for research data capture 10:30 – 11:30

A. Naming and formatting good practice - E.Manalo
B. Organization of raw data in sheets – T.Metz
C. Protecting spreadsheet data from accidental change – T.Metz

DISCUSSION 11:30 – 12:00


DAY 2 – Getting Data into the Database

I. Minimal use of Access for research data management - E.Manalo 8:00 – 8:45

A. Creating tables
B. Importing data from a spreadsheet to a database
C. Creating and running SQL database queries

II. Using SQL for data transformation and retrieval – T.Metz 8:45 – 10:00

A. Basic Query Syntax
  1. Selecting ordered subsets of data
  2. Creating new variables
  3. Summarizing data
  4. Stacking queries

BREAK 10:00 – 10:20

B. Advanced Query Syntax 10:20 – 11:30
  1. Combining data side-by-side using JOIN
  2. Combining data head-to-tail using UNION
  3. Reshaping data
C. In-depth discussion of some complex queries
  1. Numeric outliers

DISCUSSION 11:30 – 12:00


DAY 3 – Getting Data Out of the Database for Analysis

I. SQL Revisited - T.Metz 8:00 – 9:00

II. Exporting Data from a Table or a Query – E.Manalo 9:00 – 9:20

A. Copy and Paste to Excel
B. Export to different formats (.csv and .txt)
C. Linking Excel dynamically to Access

III. Analyzing data using CropStat – B.Bartolome 9:20 – 9:35

IV. Visualizing and analyzing data using SAS – B.Bartolome 9:35 – 9:50

A. Getting Started with SAS
B. Reading data from Excel and Access

V. Visualizing and analyzing data using R - T.Metz 9:50 – 10:10

A. Setting up R on your workstation
B. Reading data from Excel and Access

VI. Dynamic, interactive, linked visualization of data using Mondrian – E.Manalo 10:10 – 10:30

OPEN FORUM


Last modified April 1, 2008 8:04 am