March 31, 1998
TEL: (919) 515-6764
E-mail: mcclure@eos.ncsu.edu

MEMORANDUM

TO: ShootOut Participants

FROM: W. F. McClure, Professor

SUBJECT: RULES FOR the SOFTWARE SHOOTOUT at the IDRC98
("THE CHAMBERSBURG CONFERENCE")
Wilson College, Chambersburg, Pennsylvania, August 9-14, 1998.

INTRODUCTION

The Software Shootout has been a popular session at the IDRC. Its purpose is to encourage chemometric studies and to draw attention to the plethora of software that exists, with the hope of discovering the best method(s) of analysis, both quantitative and qualitative. This year, for the first time, the Shootout will be held as a formal part of the conference. There will be at least two invited participants; the remaining time will be open to anyone who would like to enter into the exchange of ideas.

Participants in the ShootOut are given two data sets (or files) consisting of scans and chemistry obtained from fescue grass grown in a soil medium where the moisture was wicked from liquid tanks containing four levels of fertilization (0, 50, 250 and 500 ppm of nitrogen). Each level of fertilization was duplicated, giving a total of 8 tanks. The purpose of the experiment was to address an environmental problem that confronts growers every year with increasing intensity: How much fertilizer should be added to maximize production while, at the same time, minimizing the environmental consequences of over-fertilization?

At a minimum, this study should address two questions: (1) Can NIR spectrometry measure the nitrogen status of plant material? and (2) Is this information related to fertilization? (Of course, there are other considerations that the Shooters may want to take into account after they look at the data.)

DATA FILES

Files with the suffixes *.DA1 and *.CN1 are NSAS-formatted files for spectral and constituent (chemical) data, respectively. Files with the *.txt suffix are in ASCII format. There are eight (8) files, described briefly as follows:

1. WD0.DA1 - (n = 282) - Wet/green samples with double scans on 6500. The grass samples were scanned in their wet-green state within 12 hours after harvesting (two scans per sample with the second scan from a repack aliquot). Blind-duplicate chemical (reference lab) analyses of freeze-dried samples are attached; one analysis value is attached to each of the duplicate scans.
2. WD0.CN1 - Constituent data for the above file.
3. wd0sp.txt - (n = 282) - Wet/green spectra in ASCII MATLAB matrix format.
4. wd0cn.txt - Constituent data in ASCII format for above samples.
5. PS0.DA1 - (n = 141) - Powdered (dry ground) with one scan of each sample. The related chemical values are the average of the blind duplicates.
6. PS0.CN1 - Constituent data for the above file.
7. ps0sp.txt - (n = 141) - Single scans of powdered samples.
8. ps0cn.txt - Constituent data in ASCII format for above samples.
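
As a hedged illustration of reading the ASCII files listed above, the following Python sketch parses a whitespace-delimited numeric matrix. The layout (one row per scan, values separated by spaces or tabs, no header line) is an assumption; this memo does not specify it. The helper name read_ascii_matrix is hypothetical, and the numbers in the demonstration are made up.

```python
import io

def read_ascii_matrix(stream):
    """Parse whitespace-delimited numeric text into a list of rows of floats.

    Assumption: one row per line, blank lines ignored, no header line.
    """
    rows = []
    for line in stream:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        rows.append([float(f) for f in fields])
    # Every row must have the same number of columns to form a matrix.
    widths = {len(r) for r in rows}
    if len(widths) > 1:
        raise ValueError(f"ragged matrix: row lengths {sorted(widths)}")
    return rows

# Tiny in-memory stand-in for a file such as ps0sp.txt (values are made up).
demo = io.StringIO("0.10 0.20 0.30\n0.11 0.21 0.31\n")
spectra = read_ascii_matrix(demo)
print(len(spectra), "rows x", len(spectra[0]), "columns")
```

For a real file, pass an open file handle instead of the in-memory text, for example open("ps0sp.txt"). Under the one-row-per-scan assumption, the row count should agree with the n given in the file description.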

SPECTRAL DATA

Neither the spectral data nor the constituent data has been doctored in any way. All spectra came from grass samples in a planned experiment as stated above. There are no tricks this year, as there were in the past. The spectra and constituent values are real-life data. Thus, shooters can attack the data without worrying about substitutions or reassigned values.

CONSTITUENT DATA

The chemistry was determined on a LECO CNS-2000 Carbon, Nitrogen and Sulphur Analyzer. This is a non-dispersive infrared, microcomputer-based instrument designed to measure the carbon, nitrogen and sulphur content in a wide variety of organic compounds. Carbon and sulphur are measured by infrared radiation detection; nitrogen is determined by thermal conductivity.

Nitrogen, sulphur and carbon were analyzed in blind duplicates. The values are attached, for example, as nitrogen a (the average of the two duplicates) and as nitrogen 1 and nitrogen 2 (the two duplicates, respectively). Hence, there are nine (9) constituent values associated with each spectral file, three for each constituent, plus one fertilization parameter.
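
The relationship among the three values attached for each constituent can be sketched in Python as follows. The averaging step mirrors the "a" value described above. The standard error of the duplicates, sqrt(sum(d^2) / (2n)), is a commonly used estimate of reference-lab repeatability; it is offered here as an assumption about what a Shooter might compute, not as something this memo prescribes. The function name duplicate_stats and all numbers are hypothetical.

```python
import math

def duplicate_stats(rep1, rep2):
    """Return (averages, standard error of the duplicates) for paired values.

    The averages correspond to the "a" value; the standard error uses the
    usual duplicate formula sqrt(sum(d^2) / (2 n)), with d = rep1 - rep2.
    """
    if len(rep1) != len(rep2) or not rep1:
        raise ValueError("need two non-empty lists of equal length")
    averages = [(a + b) / 2 for a, b in zip(rep1, rep2)]
    sum_sq = sum((a - b) ** 2 for a, b in zip(rep1, rep2))
    sel = math.sqrt(sum_sq / (2 * len(rep1)))
    return averages, sel

# Made-up duplicate values, for illustration only.
nitrogen_1 = [2.0, 3.0, 4.0]
nitrogen_2 = [2.2, 2.8, 4.0]
nitrogen_a, sel = duplicate_stats(nitrogen_1, nitrogen_2)
print(nitrogen_a, round(sel, 4))
```

Comparing the computed averages against the supplied "a" column is a quick consistency check on the constituent file before calibration.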


FILE CODES

The data sets are provided in two formats: (1) NSAS and (2) ASCII (MATLAB). Upper-case filenames are in NSAS (FOSS NIRSystems) format; lower-case filenames are in ASCII (MATLAB) format. Letter-codes used in the filenames are defined as follows:

1. First Letter in Filename (sample constitution):
W = wet green sample scanned directly after harvesting
P = dry sample ground through a 1 mm screen in a Wiley mill
Q = dry sample (P) scanned on a 19 filter instrument
2. Second Letter in Filename (number of scans per sample):
D = Double scans (repacks)
S = Single scans (no repacks)
3. Third Character in Filename (pretreatments):
0 = no pretreatment
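
The filename codes above can be expressed as a small lookup, sketched here in Python. The function name decode_filename is hypothetical, and treating any mixed-case filename as "unknown" format is an assumption; the memo only defines all-upper-case and all-lower-case names.

```python
SAMPLE_STATE = {
    "W": "wet green sample scanned directly after harvesting",
    "P": "dry sample ground through a 1 mm screen in a Wiley mill",
    "Q": "dry sample (P) scanned on a 19 filter instrument",
}
SCANS = {"D": "double scans (repacks)", "S": "single scans (no repacks)"}
PRETREATMENT = {"0": "no pretreatment"}

def decode_filename(name):
    """Decode the first three characters of a ShootOut filename."""
    stem = name.split(".")[0]
    code = stem[:3].upper()
    if len(code) < 3:
        raise ValueError(f"filename too short to decode: {name!r}")
    try:
        decoded = {
            "sample": SAMPLE_STATE[code[0]],
            "scans": SCANS[code[1]],
            "pretreatment": PRETREATMENT[code[2]],
        }
    except KeyError as exc:
        raise ValueError(f"unknown code {exc} in {name!r}") from None
    # Upper-case names are NSAS; lower-case names are ASCII (MATLAB).
    if stem.isupper():
        decoded["format"] = "NSAS"
    elif stem.islower():
        decoded["format"] = "ASCII (MATLAB)"
    else:
        decoded["format"] = "unknown"  # assumption: mixed case is undefined
    return decoded

for example in ("WD0.DA1", "ps0sp.txt"):
    print(example, "->", decode_filename(example))
```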

All data for the shootout have been posted on the IDRC website by Dr. Rob Lodder and his associates at the University of Kentucky:

http://kerouac.pharm.uky.edu/asrg/cnirs/cnirs.html


Remember, there will be at least two invited presentations; volunteers will be accommodated on a first-come, first-served basis. Invited presenters will have a maximum of 30 minutes; volunteer presenters will have a maximum of 15 minutes. A discussion period will be held at the end. Comments, observations and criticisms from the audience will be considered at that time. Prizes will be awarded according to the ruling of a panel of three judges. The decision of the judges (based on originality, systematics and novelty of both qualitative and quantitative analyses, and best presentation) will be final.

OBJECTIVE: All participants are asked to:

A. Develop their best calibration for each of the three constituents using any software they choose.
B. Report both qualitative and quantitative (outliers, etc.) results.

PREVIOUS WINNERS of the SHOOTOUT will be allowed to defend their position in the ShootOut. Decisions of the Judges are final. Presentation of awards will be made at the banquet on Thursday night. Winners MUST be present to win.

VOLUNTEER PRESENTERS should contact Fred McClure as soon as possible:
email: mcclure@eos.ncsu.edu
FAX: 919-515-7760
TEL: 919-515-6764