Monday 25 August 2014

Automating the export of edat2 files from E-DataAid


So, you have a data set collected in E-Prime and you want to export it for processing in another tool suite?  Since it takes half a dozen mouse clicks per file, you can only imagine how un-fun this gets once the data set grows to dozens or hundreds of files.

There are three methods for exporting a data set of edat2 files that I know of:

1) Manually click your way through the whole data set and export each individually.
2) Use E-Merge to merge all the files into one huge file... then export it in one step... then figure out how to process it (either split it back into individual participant chunks or do a population study on the whole lot)

3) Use the script below to get it all done automagically in one mouse click.

My problem is that each year I have dozens of E-Prime data sets to handle.  This has resulted in needing to process literally hundreds of edat2 files every year.  I usually teach the students how to do the manual export method, but that is still prone to human error and often results in me having to go and find a file that has been misnamed or skipped in the process.  More time and effort spent.  This is a repetitive problem with no variability... there had to be an automated solution.

So after appropriate googling around and some futile AutoHotkey hacking, I contacted PST help and found a helpful customer support person.  After explaining my need and establishing that the first two solutions above were not suitable for my problem, the support person produced some documentation for a simple command line interface to E-DataAid.  This provided me with the scriptable interface that I needed.  Add a little Perl hacking and tada.... solution.

Find below my solution and some notes.

The Solution for Exporting edat2 files to text files

For this solution I used Perl as my scripting language of choice. I recommend ActiveState ActivePerl.

To use this script you will need Perl and E-Prime 2 installed on the computer.  Then simply copy this script into a text file and name it something useful, like "DumpEdatToText.pl" and save it in the directory with the edat2 files.

Then double-click the Perl script to run it. It should export each edat2 file to a tab-delimited text file named with the participant id and session id. For example "p1s1.csv"  (Note: while this is not strictly speaking a CSV file, the file extension makes it easy to import into Excel for my purposes)

Following is my Perl script:

use strict;
use warnings;

# full path to E-DataAid; adjust if E-Prime 2 is installed somewhere else
my $eDataAid = 'C:\Program Files (x86)\PST\E-Prime 2.0\Program\E-DataAid.exe';

# name of the temporary command file, rewritten for each edat2 file
my $theCommandFileName = "cmdFile.txt";

# get all the edat2 files in the current directory
my @files = glob("*.edat2");

# process each file individually
foreach my $file (@files) {

    print "$file being processed\n";

    # get the participant ID and session ID from the end of the file name,
    # e.g. "nBackVerbal-1-1.edat2" gives participant 1 and session 1
    my $p = "partID not found";
    my $s = "SessionID not found";
    if ($file =~ /-([^-]+)-([^-]+)\.edat2$/i) {
        ($p, $s) = ($1, $2);
    }

    my $outfileName = "p" . $p . "s" . $s . ".csv";

    # write the command file (with CRLF line endings)
    open(my $cmdFile, '>:crlf', $theCommandFileName)
        or die "\nUnable to open $theCommandFileName: $!\n";

    print $cmdFile "Inheritance=true\n";
    print $cmdFile "InFile=$file\n";
    print $cmdFile "OutFile=$outfileName\n";
    print $cmdFile "ColFlags=0\n";
    print $cmdFile "ColNames=1\n";
    print $cmdFile "Comments=0\n";
    print $cmdFile "BegCommentLine=\n";
    print $cmdFile "EndCommentLine=\n";
    print $cmdFile "DataSeparator=\t\n";    # tab character
    print $cmdFile "VarSeparator=\t\n";     # tab character
    print $cmdFile "BegDataLine=\n";
    print $cmdFile "EndDataLine=\n";
    print $cmdFile "MissingData=\n";
    print $cmdFile "Unicode=1\n";

    close($cmdFile)
        or die "\nUnable to close $theCommandFileName: $!\n";

    # pass the command file to E-DataAid for export
    my $status = system(qq{"$eDataAid" /e /f $theCommandFileName});

    # report a non-zero exit status rather than failing silently
    warn "E-DataAid returned status $status for $file\n" if $status != 0;
}


End Perl Script

Notes

Error Messages from E-DataAid
Error 1
No Output and no error message generated when running the following command
"C:\Program Files (x86)\PST\E-Prime 2.0\Program\E-DataAid.exe" /e  exampleB.txt

Solution – forgot to include the /f flag on command line. Now works.
"C:\Program Files (x86)\PST\E-Prime 2.0\Program\E-DataAid.exe" /e /f exampleB.txt
The missing flag should generate an error message.

Error 2
Message: “Error Reading Unicode from command file”
Solution – Added “Unicode=1” to end of command file.  Guessed that it’s a flag field. Seems to work.

Error 3
Message: “Error Reading E-Prime data file: D:\TestDumper\test.edat2 file” 
Solution – Had the full path to the file in the command file. Replaced with a relative path and the command worked. 
InFile=D:\TestDumper\test.edat2  <-Failed
InFile=nBackVerbal-1-1.edat2 <- Worked

Error 4
Leaving the “InFile=” field empty will crash E-DataAid
E-DataAid should correctly handle and report the missing field.

Error 5
Leaving the “OutFile=” field empty will generate a spurious error message.  “Error exporting to text file: .” 
E-DataAid  should detect the missing field and correctly report that there is no output file name supplied.

Error 6
An absolute file path to the edat2 file in the “InFile=” field causes a failure (see Error 3).
This may be because E-DataAid assumes that the file is in the current working directory.
Solution - Use a relative path or work in the same directory.

Wednesday 30 July 2014

Research Instrument Design and Anonymity

Here's a scenario...

Your research design means you are collecting data using an online survey instrument.  You have ethical clearance to collect data from your participant population anonymously. You then embed a question in the survey asking the participants to provide their email address if they would like a summary of the results of the research; as per your ethical obligation.

Later....

When looking at the raw data from your instrument, you see a row of the responses provided by the participant... next to their email address. 

Now tell me how this is anonymous... and show your working!

Why is this a problem?

1) You know the population that was invited to participate.
2) You know some or all of the email addresses of those participants. By elimination you may be able to guess more identities. With additional demographic information you can further refine your guesses.
3) You now need to store and secure the data from your instrument at a much higher level of security.
4) You are now required to store ALL this data for seven years in such a way that it can be reviewed by third parties (whom you do not know) at any time in the future.
5) This data exists on multiple computer systems already.
6) This data may leak in a number of obvious and non-obvious ways.
7) You are required by legislation to handle and secure this data as of 12/3/2014.
8) It's really hard to be sure that data is actually deleted. Sooner or later, a search engine will find it.

How is this ethical research?  How are you in compliance with the NHMRC guidelines?  How are you in compliance with the Australian Privacy Legislation? How are you in compliance with the University Policies? 

The government has passed some updates to the privacy laws and now they have real and specific applicability to this data.

Penalties for non-compliance

You are individually liable for penalties if you do not comply (up to $340,000 for individuals, $1.7 million for organisations... i.e. your employer, the University).  See the Privacy Legislation below.

What is Personal Information?

 Personal information has the meaning as set out in s 6 of the Privacy Act:
information or an opinion (including information or an opinion forming part of a database), whether true or not, and whether recorded in a material form or not, about an individual whose identity is apparent, or can reasonably be ascertained, from the information or opinion.
Sensitive information is a subset of personal information. The Privacy Act defines sensitive information as:
  1. information or an opinion about an individual’s:
    1. racial or ethnic origin; or
    2. political opinions; or
    3. membership of a political association; or
    4. religious beliefs or affiliations; or
    5. philosophical beliefs; or
    6. membership of a professional or trade association; or
    7. membership of a trade union; or
    8. sexual preferences or practices; or
    9. criminal record;
    that is also personal information; or
  2. health information about an individual; or
  3. genetic information about an individual that is not otherwise health information.

What are your obligations once you collect Personal Information in a research data set?

  • Securely storing the data (Copying, publishing, data requests, integrity investigations)
  • Securely destroying the data
  • Authentication of access to the data
  • Mandatory reporting of data breaches
  • Transfer of control of the data
  • Hosting the data on foreign servers
  • Implications of the Freedom of Information Act
The list of obligations and the cost for compliance is quite substantial..... are you sure you want to collect this information for your research project?

Academic email is subject to the Freedom of Information Act under Australian law.  This means that if your data set has been emailed (say between the student and their supervisor) then it could be leaked via that mechanism.

What can Students and Supervisors do?

Avoid this whole mess by not collecting personally identifying information (Email Addresses specifically) as part of the data set. Design around this potential risk.

Do not sample very small, specific, known populations. 

Beware when the Ethics review hands back a requirement for this kind of mechanism in your study.  Be prepared to push back with some alternate design strategies to avoid this problem.

Be aware of the legislation and the implications of compliance.  

Design research to separate identity and data, if the participants are known.  Do not embed their identities in the data set or research materials (which must then be stored and shared).

Alternate Design Strategies

Case 1 - Ethical Requirement for optional feedback of research results to research participants.

The recommended (by me) strategy is to provide the participants with a contact email address (researcher or supervisor) from whom they can request a copy of the results of the research.
This strategy avoids the issue of collecting and holding a list of email addresses with their associated cost and the risk of violating the anonymity of the participants. 

Case 2 - Repeated measures design requiring followup contact with participants.

Request participants to contact the researcher and be added to a pool prior to the data collection starting.  Then the researcher can broadcast to this list an anonymous link to the data collection instrument at each measure time.

This provides a disconnect between the participant's activity and their identity.  Unless the researcher has a very small pool or makes other attempts to link the participant and their data... there is no way to identify who has provided which data record.

This then allows the list of email addresses to be stored separately and destroyed securely, independently of the data set that results from the research.


Further Reading

10 Steps to Protect Other People's Personal Information
http://www.oaic.gov.au/privacy/privacy-resources/privacy-fact-sheets/other/privacy-fact-sheet-7-ten-steps-to-protect-other-people-s-personal-information

How to de-identify data
http://www.oaic.gov.au/privacy/privacy-resources/privacy-business-resources/privacy-business-resource-4-de-identification-of-data-and-information

The 13 Australian Privacy Principles (APPs)

General information on Information security
http://www.oaic.gov.au/privacy/privacy-resources/privacy-guides/guide-to-information-security



Thursday 5 June 2014

Random Stimuli Sequences

When is a sequence of stimuli "Random" enough?

Students often have a "pre-conceived" idea of what they think "Random" looks like.  Which if you think about it for a second is interesting in itself.  (Go look up Apophenia)

The correct answer is: When the sequence appears to have no pattern perceivable by the participant!

Note that this is not the same as "No pattern perceivable to the researcher".  The researcher is often conditioned to the sequence simply by their understanding of the research design and having tested the experiment a few (or hundreds) of times.  Their brain is already trying to detect patterns in the sequence.  This is what brains do.  

This is always in contrast to a naive participant who will only experience the sequence once.  (If your design calls for repeated measures... that is a different design)


What are the options?


Random Sequence With replacement


Imagine a bag that contains all the possible sequence items,  reach in and take an item without looking,  record the item and then return it to the bag.  Repeat as needed. 

This method uses a sampling mechanism of reaching blindly into a bag.  This means that each sample is independent.

For example, if we have five possible items in the bag (5,9,4,2,7), then a sequence of five samples could be any of the following:

5,9,4,2,7

5,5,5,5,5

7,7,7,7,2

5,9,5,9,5

These sequences that appear to contain patterns are all "legal" and can quite possibly be generated using this mechanism.  In total there are 5 * 5 * 5 * 5 * 5 (3125) different possible sequences, each equally likely.

Much like a series of coin tosses, the distribution of the items in this random system will approach being perfectly even as the sample size grows... but it will not be anything like even at small sample sizes. Keep in mind that any participant who experiences the sequence once is a sample size of 1, i.e. lots of noise in the distribution.  Because of this, sequences that appear to have a pattern are generally not what the researchers "want" to see.  And while this is still a very effective mechanism for generating the stimuli sequence, the possibility of there appearing to be a pattern can mess with the researcher's head.
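As a rough illustration of the bag-with-replacement idea, here is a minimal sketch. It assumes Python and its standard `random` module, neither of which appears in the original post, and the function name `sample_with_replacement` is made up for the example.

```python
import random
from itertools import product

ITEMS = [5, 9, 4, 2, 7]  # the example bag from the text


def sample_with_replacement(items, length, rng=random):
    """Draw `length` items; every draw is independent because the
    chosen item goes straight back into the bag."""
    return [rng.choice(items) for _ in range(length)]


# Enumerate every possible five-item sequence, repeats allowed:
# 5 * 5 * 5 * 5 * 5 of them.
ALL_SEQUENCES = list(product(ITEMS, repeat=5))

if __name__ == "__main__":
    print(len(ALL_SEQUENCES))  # 3125
    print(sample_with_replacement(ITEMS, 5))
```

Passing a seeded `random.Random(...)` as `rng` makes a sequence reproducible, which can help when a generated order needs to be checked or reused later.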

Random Sequence Without Replacement

Imagine a bag that contains all the possible sequence items,  reach in and take an item without looking, record that item but do not return it to the bag.  Now pick the next item from those remaining in the bag.  Obviously the bag will eventually be exhausted.  (At which point your sequence may be finished, or you may return all items to the bag and start again)

For instance, again with our items of ( 5,9,4,2,7), some sequences could be

7,5,9,2,4

2,4,5,7,9

There are 5 * 4 * 3 * 2 * 1 (120) possible sequences if we are creating five item sequences.

If we are creating, say, a ten item sequence, then we would run this method twice.

e.g. 2,4,5,7,9,7,5,9,2,4

Which would give us 120 * 120 (14400) possible sequences.

This mechanism will generate sequences that do not have the same possibility of repetition of items as the first mechanism. 

The distribution of items in this method is much closer to even at small sample sizes and so creates a safe feeling for the researchers.  However, keep in mind that where the number of items in the bag is small, this can create a repeating sequence that the participants can still detect.  (7 plus or minus 2 being a useful rule of thumb for the number of items a person can remember )

Generally, if the number of sequence items is less than 9, the participant can start to anticipate by elimination what item may come next.  This is unavoidable, as it's part of the "normal" function of the brain. The longer the sequence runs... the better they will feel at this.  (This does not mean they will be right... but over time, it can be better than chance for people who are good at this task)
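The bag-without-replacement method can be sketched the same way. Again this is only an illustration, assuming Python's standard `random` and `math` modules; the name `sample_without_replacement` and the `n_blocks` parameter are invented for the example.

```python
import math
import random

ITEMS = [5, 9, 4, 2, 7]  # the example bag from the text


def sample_without_replacement(items, n_blocks=1, rng=random):
    """Empty the bag once per block; refill and reshuffle between blocks."""
    sequence = []
    for _ in range(n_blocks):
        block = list(items)   # refill the bag
        rng.shuffle(block)    # drawing until empty is the same as shuffling
        sequence.extend(block)
    return sequence


ORDERINGS_PER_BLOCK = math.factorial(len(ITEMS))  # 5 * 4 * 3 * 2 * 1
ORDERINGS_TWO_BLOCKS = ORDERINGS_PER_BLOCK ** 2   # two independent blocks

if __name__ == "__main__":
    print(ORDERINGS_PER_BLOCK, ORDERINGS_TWO_BLOCKS)  # 120 14400
    print(sample_without_replacement(ITEMS, n_blocks=2))
```

Note that an item can still appear twice in a row where one block ends and the next begins, so "without replacement" only rules out repeats inside a single pass through the bag.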


So what is the best mechanism to generate your stimuli sequence?

Well this gets tricky because often stimuli sequences involve repetition of some items, some contain distractor items with their own frequency, others may include cueing rules and other rules for follow order.  Some sequences include "idiot check" items, some are trying to cause patterns and anticipation, while others are trying to control for these effects.... some have blocks of similar stimuli, some have intruders, control blocks and neutral effect stimuli.  There are as many permutations as there are researchers and research designs.
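To give a flavour of what one such rule might look like in practice, here is a hedged sketch of a single made-up follow-order rule: each item appears a fixed number of times and no item may follow itself. It assumes Python; the function name, the item labels and the retry limit are all hypothetical, and simple reshuffle-until-valid is just one way to do it.

```python
import random


def constrained_sequence(counts, max_tries=10_000, rng=random):
    """Shuffle a pool built from per-item frequencies, retrying until
    no item immediately follows itself."""
    pool = [item for item, n in counts.items() for _ in range(n)]
    for _ in range(max_tries):
        rng.shuffle(pool)
        # accept the shuffle only if every neighbouring pair differs
        if all(a != b for a, b in zip(pool, pool[1:])):
            return list(pool)
    raise ValueError("no sequence satisfying the rule was found")


if __name__ == "__main__":
    # hypothetical design: three targets shown twice each,
    # plus a distractor "X" shown three times
    print(constrained_sequence({"A": 2, "B": 2, "C": 2, "X": 3}))
```

Real designs usually stack several rules like this, which is exactly why it is worth talking the generator over with someone before building it.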

Find some experts and talk it over.  There is a sequence generator for you out there.

The biggest problem, however, is when the right generator has been built and verified and the researcher says "I don't think it's random enough...."


Wednesday 4 June 2014

SCU Intellectual Property Rights in Research Projects

http://policies.scu.edu.au/view.current.php?id=00017

Optical Illusion Resources

http://news.distractify.com/culture/mind-blowing-optical-illusions/?v=1

Copyright, Video and Model Releases for Research Purposes


The current copyright legal framework: the general rule is that you need to obtain permission to use a work from the rights holder.  This may include payment for the license to use it.  There are some exemptions under the Copyright Act for education and research purposes.

COPYRIGHT ACT 1968 - SECT 40 

Fair dealing for purpose of research or study

Various issues have arisen since the Copyright Act was written (digital technology) that are more complex to argue.  There are proposals to change the copyright system, but they have not yet reached the legislation stage.

ALRC Proposals to change Copyright - Educational Use section


Copyright and the Digital Economy (DP 79) 5 June 2013

Fair Use and various examples...



Taking photographs or video for use in research projects requires an SCU model release form signed by the model. This needs to be archived with your project records.

Talent Release /Permission to Use Form

Tuesday 3 June 2014

Survey Design - Participant Information Sheets


The following is a good template for a functional participant information sheet:  (Your mileage may vary) This follows the format in the ethics application docs.

Research Title: SOME TITLE HERE

My name is *Researcher Name* and I am conducting research, under the supervision of *Supervisor Name*, as part of the Honours year of the Bachelor of Psychology degree at Southern Cross University. My research project investigates the *very brief description*

What does this research involve?

*Description of the process from the participant's point of view...* This questionnaire will include questions about your .... This questionnaire should take approximately 20 minutes overall. All data collection can be done from any computer with internet access. What time of day you decide to complete the questionnaire is completely up to you.

Responsibilities of the Researcher

It is our responsibility as the researchers to provide you with sufficient information to understand the implications of participation in the research. Your participation in the study is voluntary. To ensure anonymity...... Demographic details, such as age and sex, will be linked to the data. We will keep data and consent forms securely in a Psychology Office at SCU for seven years, after which time they will be destroyed. On completion of participation in the study, we are happy to provide you with a summary of the results if you provide contact details. We are unable to provide information on individual scores or provide any clinical advice.

Responsibilities of the Participant

The measures used in the study require that participants have normal or corrected-to-normal vision and hearing. Although there have been no reports of adverse reactions to the methods used in this project, if at any time you are bothered by the experience of recording your emotions or thinking about personality variables during the online component we are asking that you access support as you require. We will give you contact details for the SCU Counselling Service, but you may choose to contact your GP or other sources of support if necessary. As the tasks involved in this project require that you log on to the internet, we are looking for participants who:
  • have access to a computer
  • are fluent in English
  • are at least 18 years of age.

Possible Discomforts and Risks

This research is considered very low risk. It is possible that reflecting on your mood or personality may cause you concern. If you are distressed as a result of participating in this research, you are advised to contact a counsellor (e.g., SCU Counselling Service, (02) 6659 3263, or by email on counselling@scu.edu.au) or other forms of support of your choosing. You are welcome to discontinue your involvement at any time without any negative consequences.

Publication of Results of this Research

The findings of this research may be submitted for publication in peer-reviewed journals or presented at conferences. A complete account of the findings of this research will be available at the Southern Cross University Library at a later stage. No identifying information will be included as all participants will just be referred to by a number.

Informed Consent

Participation in this research is voluntary. Your agreement to participate will be assumed from your completion of the survey. You are free to withdraw your consent and participation from the study at any time. Your decision will be treated with respect, no questions will be asked and there will be no negative consequences associated with your withdrawal. All your data will be destroyed immediately.

Inquiries

Any questions you may have regarding this research can be directed to the researcher or research supervisor.
*Supervisor Name here*
Lecturer
Psychology
School of Health and Human Sciences
Southern Cross University,
Coffs Harbour, NSW 2450
Ph: *****
Email:*****


Student Researcher Details
*Researcher Name Here*
Psychology
Southern Cross University
Coffs Harbour Campus NSW 2450
Email: *****

This research has been approved by the Southern Cross University Human Research Ethics Committee. The approval number is: ******.

If you have concerns about the ethical conduct of the research, please write to:
The Ethics Complaints Officer, Southern Cross University
PO Box 157
Lismore
NSW 2480
email: ******

Do you give consent?

By pressing the "Next" button, you consent to your data being used in this project. Remember you can withdraw at any time simply by closing the survey window. 

Survey Questions on Gender


Question:  How to ask demographics questions about gender.

The scenario:  You are creating a survey that includes a demographics section and asks about the gender of the participant.  You are interested in equity and diversity as dictated by the policy of the University and in respecting your participants' rights under state and federal law....


Old form of the question:

What is your gender?
A) Female
B) Male


What's changed? The Australian High Court has upheld the right of individuals to be identified as being of "non-specific" gender. Note that this is a NSW-specific issue, but other states are working on their own legislation.

http://www.abc.net.au/news/2014-04-02/high-court-recognises-gender-neutral/5361362


New form of the question:

What is your gender?
A) Female
B) Male
C) Non-specific

Please note that this is not nearly as inclusive as it could be, but it's a step forward.   There are more issues here that need attention if you are a social researcher.

  • Distinction between public and private gender 
  • Self selected terms used to identify current gender state
  • Distinction between physical gender expression and social gender/role expression
  • Distinction between biological gender (DNA) and physical gender expression
  • Gender Role descriptions

As always with survey design, there is a tension between asking a question to shape the answers given for the convenience of the researcher and asking a question to be as inclusive as possible.  If your question format is categorical (multiple choice) then you may need to include an "Other" text entry box to fully capture the range of answers your sample population may choose to respond with.




Friday 23 May 2014

Resources for Digital Researchers


Below are the most useful tools I have found for dealing with the "Digital Researcher" problem.



SCU "Official" University Publications Management System

These are the minimum systems you need to get set up to be "official".

ePublications@SCU (Division of Research is the gatekeeper - Submit Pub here)
Personal Researcher Pages at SCU  ( This is a "Managed" system so you can create the profile but SCU will "Control" it.  Create the profile here )
Staff Directory ( Update your listing here )


Academic Social Networks (Also for Publication Management)

Both the following systems are useful.  I prefer RG for the pubs management and A.edu for the discovery system for other interesting stuff. 


ResearchGate
Academia.edu

Publication Management via Researcher Identity Numbers

At some point you will need to register for one or all of these to submit publications to a journal.  ORCID seems to be the top of the pile.  They are interacting well with each other so they are all linked together. I recommend starting with ORCID.

ORCID
ResearcherID
Google Scholar

General Purpose Social Networks

Twitter
Facebook
LinkedIn

Blogging 

Blogger via a Google account

Friday 21 March 2014

Door Pin Codes in the Coffs Harbour Psychology Labs

Hi All,

This post is to capture information about the Pin Code situation for the labs.


The Labs have Pin locks.  You will need a Pin Code, key or escort to get into the labs. 

If you are not currently a Research Student, Research Staff or Research Associate, there is no reason you need or will be issued with a Pin code. 

The Pin Codes are allocated using the following logic.

Research Students

Each Lab has a generic Pin code for all the honours students using that Lab.  This code is changed each year to keep out the students from the previous year(s).

Each Lab has a generic code for all PhD students using that Lab. This code is changed when we need to keep out previous PhD students.

Individual pin codes are created for students who need access to multiple labs. These are deleted when the student completes their research.

NOTE - Please do not simply knock on my door every day to get access because you cannot manage to remember/write down/tattoo on your arm, your allocated pin code.

Research Staff

All full time research staff are issued with a Pin Code to access all the lab spaces.  These will be changed when that person ceases to be employed or moves to a non-research role.

Research Associates and Casual Staff

Short term, continuing or visiting researchers and staff will be issued with an individual Pin which will be configured to allow access to the areas they are required to work in.  This Pin will be deleted when they complete their project or their contract ends.

Research Participants

All research participants need to be escorted into and out of the labs by the researcher, student or RA that they are working with. 

DO NOT GIVE PIN CODES TO RESEARCH PARTICIPANTS.  EVER! 

They are your responsibility.  Escort them in, escort them out and call security if you think they are stealing from the labs.

Walk and Talk Tours, Visitors, Family Members, Groupies, Pets, Kids 

PIN CODES ARE NOT TO BE GIVEN TO ANY OTHER GROUPS.  DO NOT ASK.

Anyone visiting the labs does so under the escort of a student or staff member who has permission to be in the lab.  Your visitors are your responsibility. 

Tradespeople, Security and Facilities Staff

Security hold the only keys to the labs.
Tradespeople will be escorted on and off campus by security staff.
Facilities staff have a generic pin code which will be changed on a yearly basis.