An experimentation and mind-reading
application for the Emotiv EPOC
Submitted for two-semester independent work (COS 497 and 498).
Faculty Advisor: Ken Norman
Second Reader: Robert Schapire
This report describes the development and features of Experimenter, an application based on the EEG capabilities of the Emotiv EPOC headset. As a research tool, Experimenter allows a variety of experiments based on classic stimulus-presentation paradigms to be run using the Emotiv. Unlike most EEG setups, however, Experimenter not only records data but also attempts online analysis and classification of the incoming data stream. With the proper experimental setup, then, Experimenter can be used as a simple mind-reading application. Experiment and application design, sample procedures, classification techniques, results, and technical details are discussed.
This thesis represents my own work in accordance with university regulations.
Table of Contents
This report begins with a general overview of EEG (4.1) and previous work with the Emotiv headset (4.2 and 4.3). It then goes on to describe the Experimenter application itself. This includes a detailed description of the basic experiment format (5.3) and a summary of application functionality from the perspective of a user (5.4.2). The software design section (5.4.3) contains technical details regarding Experimenter’s implementation; this section is intended primarily for those who might wish to extend Experimenter or to build their own applications using Emotiv. The experiments section (6) describes specific experiments run as part of the current project, and contains instructions for implementing these experiments using the Experimenter application. The data analysis section details considerations and techniques for working with Emotiv’s EEG data (7.1) as well as various machine learning algorithms which were used for classification (7.2). This is followed by a summary of the results of applying this analysis to various collected data sets (8). Finally, I comment on my experience using the Emotiv headset (9.1) as well as future directions for this work (9.2). The walkthrough in appendix B (12.2) provides a visual tutorial for aspiring users of Experimenter, while the documentation in appendix C (12.3) is aimed at providing further assistance to developers.
The Emotiv EPOC headset is a cheap and easy-to-use mass-market EEG recording device. Targeted at video gamers, Emotiv promises to add a new dimension to the gaming experience by providing brain-computer interface (BCI) functionalities. Emotiv’s simplicity has also attracted the interest of neuroscientists. Unlike traditional EEG setups and functional MRI, the Emotiv headset offers the possibility of running inexpensive experiments from the comfort of one’s laptop. In the interest of exploiting these capabilities, I developed Experimenter, an application which can be used to collect and process EEG data within a configurable experimental framework.
Electroencephalography (EEG) is a noninvasive means of recording brain activity. An array of electrodes attached to the scalp via conductive gel or paste is used to detect electrical activity produced by neurons firing within the brain. The resulting waveforms, popularly referred to as “brain waves”, can be used to diagnose epilepsy and some other abnormal brain conditions. EEG is also of interest to researchers because it can be used as part of an experiment to study the neurological activity following the presentation of specific stimuli or under specific conditions.
The Emotiv headset is an attempt to bring EEG capabilities to the masses. At
$299.00 ($750.00 for the research SDK), the headset is relatively affordable.
More importantly, the headset’s fixed sensor-array makes it easy to use and
highly portable. Since it connects wirelessly to any PC via a USB dongle, the
user maintains a full range of motion. Instead of gel or paste, Emotiv simply
requires that the user wet its foam-tipped sensors with contact lens solution
prior to each use. The headset does pay a price, however, for such simplicity:
there are only 14 channels, less than ¼ the number used in many research setups. Moreover, although
the headset is an EEG device, most developers using the device do not have
access to the raw EEG data stream (this is available only in the research SDK);
instead, they build on Emotiv’s detection suites. Emotiv’s
online application store offers a number of innovative EEG-powered programs
built on these suites. EmoLens tags users’ photos based on their emotions when
viewing them. NeuroKey acts as a virtual keyboard for the disabled which can be
interfaced with via both facial expressions and trained cognitive actions. The
Spirit Mountain game promises a true fantasy experience by allowing the user to
manipulate a virtual world with his or her mind. The headset has also become a
part of several interesting side projects, such as acting as a controller for a
Superman-style human flight simulator and a robotic car.
Reviews of the device, however, are highly mixed. The facial
expression-detection system is said to work very well, but unsurprisingly what
reviewers really care about is the headset’s promise of mind-reading
capabilities. In this area, most reviewers seem to feel that the Emotiv’s
thought detection algorithms are at best unreliable, making it nearly
impossible to play even simple games like Pong.
Lillian Zhou (COS ’11) worked with the Emotiv headset for a one-semester independent
work project last spring. Lillian ran several experiments with Emotiv, one
manually (running the Emotiv in the background to collect data while
instructing the subject to perform a task) and one with a simple integrated
presentation and recording loop using Emotiv’s native C++ API.
This section describes the design and feature set of Experimenter. At the time of this writing, the full application source code, source-level documentation, and executable binary are available for download at http://www.princeton.edu/~madelson/experimenter.
The first iteration of the Experimenter application focused on supporting three basic features:
1. Allowing folders of images to be loaded into the application and treated as classes of stimuli.
2. Allowing the user to run an experiment where images from two such classes are displayed to the user.
3. Collecting, tagging, and logging the raw EEG data coming from the Emotiv headset while the images were being displayed.
These capabilities enabled simple stimulus presentation experiments (6.2.1), generating data for offline analysis (8.3.1). Implementing this basic application also provided insight into design challenges such as properly interfacing with the Emotiv headset and supporting a high degree of configurability. These issues were given special consideration when redesigning the code base for the final version of Experimenter.
While the ability to collect data via such experiments was useful, the initial Experimenter application lacked the ability to process the incoming data online as it was being collected. The next iteration of the application thus focused on combining the application’s presentation and data-collection capabilities with a classification tool chain that could be run in an online manner. Online classification of EEG signals is of particular interest when working with the Emotiv, since the device is sufficiently user-friendly and portable to be used in practical BCI scenarios. Attempting online “mind reading” had the added benefit of making experiments more fun for subjects. This helps to maintain attentiveness, which in turn may improve the quality of the collected data.
While any successful online classification could be considered mind reading, a truly interesting mind reading scenario should use this capability in a way that convinces the user that mind reading is really taking place. In other words, such a setup must enable the user, without looking at the application code, to believe within reason that the application really is relying on EEG data to make its predictions (as opposed to other sources of information to which it is privy). Such a setup can be said to achieve subject-verifiable mind reading. To more clearly illustrate this concept, imagine an application that displays either a face or a scene image to the user and then guesses at the type of image displayed based on online classification of collected EEG data. Assuming that the classifier obtains above-chance performance, this application possesses some basic level of mind reading/BCI capabilities. However, few skeptical users would be convinced of this, since it would be easy for the application to “cheat” because it knows the type of displayed image and thus could succeed at classification without ever examining the EEG output. There are other, less dishonest ways in which the mind reading illusion could be broken by poor experimental design. As discussed in section 7.1.1, the Emotiv headset is very sensitive to facial muscle movements. If an experiment were to trigger such movements in a systematic way, it would run the risk of introducing a muscle-based signal that could allow the application to classify the incoming EEG data without actually using neuronal information. Thus, the application’s capabilities may be written off as something more akin to facial expression recognition rather than true mind reading. Furthermore, the data collected using such a design would be of limited offline use due to contamination by these so-called “motion artifacts”. Incidentally, this can be seen as a problem with Emotiv’s CognitivTM classification suite. Since users are given little guidance as to what to do when training the classifier to recognize a particular cognitive event, it is difficult to tell whether the resultant classifier really relies on the neuronal signal rather than using stronger muscle-based signals.
To achieve subject-verifiable mind reading then, it is necessary to present the user with a decision whose outcome cannot be known to the application except possibly through collected EEG data. However, this choice should also be structured in a way that does not invite subjects to accidentally contaminate their data with systematic muscle-based electrical activity. The experiment described in the following sections was designed to adhere to these guidelines.
Experimenter allows users to execute a highly configurable stimulus presentation experiment with support for real-time classification in addition to data collection. Furthermore, each stimulus display is followed by a simple interactive task to improve subject concentration.
The key experimental parameters are summarized below:
· S1 and S2: Two classes of stimuli (images or text) used during the experiment. The stimuli in each class are classified as belonging to one of two specified subcategories. Example: S1 consists of 100 face images, where 50 are male and 50 are female; S2 consists of 100 scene images, where 50 are indoors and 50 are outdoors.
· T: The stimulus display time (in ms).
· Ntrain: The number of training trials for each stimulus class.
· Ntest: The number of test trials.
· C1, C2 … Cn: A set of classifiers to be used in online classification. Example: C1 is an AdaBoost classifier (see 7.2.5) with a decision stump (see 7.2.3) as the weak learner.
The experiment proceeds as follows:
1. Assemble two lists of images, L1 and L2. L1 consists of the repeated concatenation of the set of stimuli in S1, with each concatenated set randomly shuffled. L2 is constructed in the same way using the stimuli in S2. Each experimental trial will consist of the presentation of one stimulus from each list, so both lists are of sufficient length so as to contain one stimulus for each training trial and each test trial in the experiment. The practice of exhausting all available stimuli in a class before allowing for any repetition rather than just randomly sampling from the stimulus set with replacement is desirable because a subject’s reaction to a stimulus may be different on the first viewing than on subsequent viewings. However, some experiments may have fewer stimuli than trials and thus require repeat presentations. In this case, shuffling each concatenated set of stimuli helps eliminate any potential bias due to the subject’s becoming aware of patterns in the presentation (e. g. that the same two stimuli are always presented together) while still maintaining a (probabilistically) long time period between repeated presentations of a stimulus. (A code sketch of this list-construction procedure appears after the procedure description below.)
2. Training Phase (repeat 2 × Ntrain times)
a. Let I1 and I2 be the next stimuli in L1 and L2, respectively. Let I* be randomly selected to be either I1 or I2, unless doing so would cause either stimulus class to be selected more than Ntrain times (in this case the stimulus from the under-selected class is chosen to be I*).
b. Randomly assign one of the images to be displayed on the left side of the screen. The other will be displayed on the right.
c. On each side of the screen, display the class name of the image assigned to that side of the screen. In the center of the screen, an arrow indicates which side of the screen will contain I*. At this point, the subject should focus on the indicated side of the screen (Figure 7).
d. Replace the class names with fixation crosses, and display for a brief period of time (Figure 8).
e. Display the images I1 and I2 on their respective sides of the screen for time T (Figure 9).
f. Present the subject with a display consisting of a question corresponding to the class of I* (e.g. Was the image of a male face or a female face? if I* was a face image). The subject may answer with either of the class’s two subcategories (e.g. male or female), or with a third option: I don’t know (invalidate trial). This option allows subjects to throw away trials where they accidentally looked at the wrong stimulus (Figure 10).
g. If the subject answered I don’t know (invalidate trial), or if the subject chose a subcategory that did not match I*’s subcategory, or if a motion artifact was detected during the trial (see 7.1.2), repeat the current trial with new stimuli (go to 2a).
h. Record the trial (go to step 2).
3. C1 through Cn are trained on the trial data collected during the training phase.
4. Test Phase (repeat Ntest times)
a. Let I1 and I2 be the next images in L1 and L2, respectively.
b. Randomly assign one of the images to be displayed on the left side of the screen. The other will be displayed on the right.
c. On each side of the screen, display the class name of the image assigned to that side of the screen. In the center of the screen, a bidirectional arrow indicates that the subject may focus on either side of the screen. At this point, the subject should choose one side of the screen (and thus one stimulus class) to focus on (Figure 11).
d. Replace the class names with fixation crosses, and display for a brief period of time.
e. Display the images I1 and I2 on their respective sides of the screen for time T.
f. Use each classifier to predict which class of image the subject looked at. Display these predictions (Figure 12).
g. Present the subject with a display asking which class of image she focused on.
h. Based on the subject’s response to g, present the subject with a display consisting of a question corresponding to the class of the selected image. Again, the subject may answer with either of the class’s two subcategories, or with I don’t know (invalidate trial) (Figure 12).
i. If the subject answered I don’t know (invalidate trial), or if the subject chose a subcategory that did not match the selected image’s subcategory, or if a motion artifact was detected during the trial, repeat the current trial with new images (go to step 4a).
j. Record the trial (go to step 4).
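The list-construction procedure from step 1 amounts to block randomization: independently shuffled copies of the stimulus set are appended until the list is long enough for all trials in the experiment. The following is a minimal, stand-alone sketch of that idea; the class and method names are illustrative rather than those used in the Experimenter code base, and for a given class minLength would be the total number of training and test trials.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class StimulusListBuilder
{
    // Builds a presentation list of at least minLength stimuli by repeatedly
    // appending independently shuffled copies of the stimulus set. Each block
    // exhausts every stimulus once before any stimulus can repeat.
    public static List<T> Build<T>(IList<T> stimuli, int minLength, Random rng)
    {
        var list = new List<T>();
        while (list.Count < minLength)
        {
            // Fisher-Yates shuffle of a fresh copy of the stimulus set.
            var block = stimuli.ToArray();
            for (int i = block.Length - 1; i > 0; i--)
            {
                int j = rng.Next(i + 1);
                var tmp = block[i]; block[i] = block[j]; block[j] = tmp;
            }
            list.AddRange(block);
        }
        return list;
    }
}
```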
In addition to the parameters specified above, many other variables can be configured through Experimenter’s interface, including the size of the displayed images, the amount of time for which the fixation crosses and instructions are presented, and the rest period between trials. Online classifiers can also be configured to periodically retrain during the test phase in an effort to improve performance.
This experimental design can be seen as an instance of subject-verifiable mind-reading because the user makes a mental choice during the test phase, the result of which is not known by the application until after it has made its prediction. Although the need for the subject to view different sides of the screen might seem to be a source of motion-artifacts, any movement of this sort should only occur between trials and thus should not corrupt the data. Even if such corruption were to occur, the randomized placement of the images prevents this from being a source of systematic bias.
Experimenter also contains options to run several variations on the basic experiment. For example, rather than placing the images at opposite sides of the screen, the images can be superimposed with the top image made semi-transparent. The subject’s job, then, is to focus on one particular image. This feature also allows for experiments in which the task varies between conditions but the stimulus does not. To implement such a design, the same set of stimuli can be used for both classes with the experiment run in superimposed mode at zero transparency (as in 6.3). Another useful option is the ability to have Experimenter play a tone at the end of each trial. This allows experiments to be run where the subject does something other than look at the screen (as in 6.1). Experimenter also supports the use of text in place of images, which permits experiments regarding the subject’s response to particular words as well as text-based tasks. Finally, the policy of invalidating trials where the subject responded incorrectly to a question can be disabled, as can the asking of the question. This is useful when there is no definite answer to a particular question (e. g. “Would it be hard to illustrate the given text?”).
The trial timing supported by Experimenter is not millisecond-accurate. This means that a stimulus which is set to display for 500ms may display for a slightly longer or slightly shorter period of time (see the discussion of timing in the software design section, 5.4.3). The data recorded during this time, however, maintains the time stamp generated by the Emotiv. Thus, the accuracy of the data time stamp is not subject to the inaccuracy of the display timing. To guarantee that a minimum amount of time’s worth of data is recorded, it is generally sufficient to simply pad the display time value by 50ms.
This section gives an overview of the Experimenter application’s capabilities and feature set. See appendix B (12.2) for a visual tutorial for the application.
Experimenter was designed with the purpose of being useful even to those with no desire or capability to modify its code base. This required making the program highly configurable yet (hopefully) easy to use. Furthermore, no attempt was made to replicate the capabilities offered by the Emotiv TestBench application which is provided with the research SDK and thus is presumably available to any user of Experimenter. Those features include real-time plotting of EEG data as well as a connectivity display, both of which are vital for getting the headset properly situated.
The Experimenter graphical user interface (GUI) is partitioned into three distinct regions, each of which allows for the configuration of one part of the experiment. The Experiment Settings region allows users to configure general experiment parameters, such as the image display time, as well as output options such as raw data output or an experiment log (Figure 13). The Classifiers region allows users to create and configure machine learning algorithms for online classification (Figure 14, Figure 15). Both classifier-specific parameters and general feature selection parameters can be adjusted. Multiple differently-configured classifiers can be run in parallel during an experiment to compare their performance. The region also contains a panel for configuring the artifact detection algorithm (Figure 14). When the headset is connected, artifact detection is performed in real time, giving the user feedback as he or she tunes the various parameters. Finally, the Classes region allows the user to create and configure classes of stimuli (Figure 16). Creating a new stimulus class from a folder of images and/or text files is simple. The class is created either by selecting its folder via the New button or by dragging and dropping the folder into the list of classes on the Classes tab (multiple folders can be loaded simultaneously in this manner). This action creates a new tab for the class, where various settings such as the class’s name and EEG marker (used to tag data when a stimulus from that class is being displayed) can be configured (Figure 17). The class’s tab also allows the folder’s stimuli to be viewed and either included or excluded from the class. The Classify button launches a tool for quickly specifying the subclass (e. g. male or female for face images) of each stimulus (Figure 18).
To make configuration easier over multiple sessions, each configurable region supports save and load functionality. This is especially valuable for large stimulus classes. Each stimulus can be tagged with a subclass once, and the result can then be saved to a special file in the folder where it will be accessed automatically when the class is loaded in the future.
This section details several major design choices and patterns in the Experimenter code base. It is intended both to justify the software design and to act as a high-level guide for those interested in using or modifying the code.
Experimenter is written completely in the C# programming language, which runs on Microsoft’s
.NET framework. The primary reason for choosing C# for this project is that
Emotiv officially supports a C# wrapper for its native C++ SDK. C#’s automatic
memory management and built-in support for graphical components make it
well-suited for building graphical desktop applications like Experimenter. However,
its use is restricted to .NET which in turn is generally restricted to the
Windows operating system. Currently, this is not an issue, since Emotiv itself supports only Windows.
The research edition of the Emotiv SDK exposes an API for accessing the headset’s raw data feed. This interface presents a strange mixture of event-driven and procedural access methods and is generally difficult to work with. Once the headset becomes connected, incoming raw data is placed in a fixed-size buffer until it is explicitly polled by the application. If this is done too infrequently, perhaps because lengthy processing (e. g. classification) occurs on the polling thread between polls, some data will be lost.
In an attempt to make the Emotiv easier to work with programmatically, Experimenter
wraps the Emotiv API with an instance of the publish/subscribe design pattern.
There are several advantages to this design. Because the polling thread of execution does not have to wait while each listener processes the incoming data, the risk of data loss is minimized. The ability to broadcast EEG data from one source to many listeners is valuable because Experimenter has several distinct consumers of this data. In the Experimenter GUI, separate listeners are used to implement real-time artifact detection and headset connection status; each listener is associated with a particular task and graphical component. While experiments are being run, several more listeners may be active to watch for a loss of connection, to log raw data, or to break incoming data into tagged trials for classification. Another benefit is the ability to substitute a “mock” data source that publishes a random signal in place of the Emotiv headset data source when the headset is not connected. This is vital for testing the application without constant use of the headset.
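The sketch below illustrates the general shape of this publish/subscribe arrangement. The type names (EegSample, IEegDataListener, EegDataPublisher) are illustrative rather than Experimenter’s actual classes, and the threading details are simplified, but the key property is the same: the polling thread only enqueues samples and never waits on listeners.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// One packet of raw data: a time stamp plus one voltage reading per channel.
class EegSample
{
    public readonly double TimeStamp;
    public readonly double[] ChannelVoltages;
    public EegSample(double timeStamp, double[] channelVoltages)
    {
        TimeStamp = timeStamp;
        ChannelVoltages = channelVoltages;
    }
}

// Anything that wants to consume the raw data stream implements this interface.
interface IEegDataListener
{
    void Listen(EegSample sample);
}

// Broadcasts samples from a single source (the headset or a mock random-signal
// source) to any number of listeners. Each listener gets its own queue and
// worker task, so a slow listener never delays the polling thread.
class EegDataPublisher : IDisposable
{
    private readonly List<BlockingCollection<EegSample>> queues =
        new List<BlockingCollection<EegSample>>();

    // For simplicity this sketch assumes all subscriptions happen before publishing begins.
    public void Subscribe(IEegDataListener listener)
    {
        var queue = new BlockingCollection<EegSample>();
        queues.Add(queue);
        Task.Factory.StartNew(() =>
        {
            foreach (var sample in queue.GetConsumingEnumerable())
                listener.Listen(sample);
        }, TaskCreationOptions.LongRunning);
    }

    // Called by the polling thread each time a sample is read from the headset's buffer.
    public void Publish(EegSample sample)
    {
        foreach (var queue in queues)
            queue.Add(sample);
    }

    public void Dispose()
    {
        foreach (var queue in queues)
            queue.CompleteAdding();
    }
}
```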
As the experiment grew more complex, so too did the requirements for its implementation. Since the experiment’s logic requires updating the display based both on time and user input, the simple timer-based approach that sufficed for the initial version was no longer adequate. While the experiment logic remained fairly simple to describe sequentially with standard control flow constructs (e. g. if-then-else and loops), running such code directly within the GUI thread of execution would have prevented that thread from being able to respond to user input, such as a button press or an attempt to close the application window. The simplest alternative was to map the logic onto a sequence of GUI events, with timers and button presses triggering changes to the display as well as the creation of new timers and buttons which in turn would be used to trigger future changes. Such an implementation would have made the code extremely complex and unmaintainable, since it would lose all semblance of the sequential logic of the experiment.
Another issue was the proper management of various resources. Even though the .NET
framework provides automatic memory management in the form of garbage
collection, in some cases explicit management is still required. This is
because some managed objects hold access to resources such as open files,
images, and graphical components which are not “expensive” in the sense of
consuming substantial process memory but which do consume substantial underlying
system resources. The supply of such resources to a process is typically limited
by the operating system. Since there is no guarantee that the managed handlers
of these resources will be garbage collected in a timely manner (especially
since the memory footprint of the managed object may be very small), it is considered
good practice to explicitly dispose of these handler objects.
Experimenter’s solution to these problems is a presentation system which makes heavy use of two powerful C# features. The yield keyword allows a function to return a sequence of values in a “lazy” manner. That is, rather than assembling the entire sequence and returning it to the caller, the sequence-generating code is suspended until the caller actually attempts to access an element in the sequence. At this point, the sequence-generating code executes until the next value is “yielded” to the caller, at which point it is once again suspended (in reality, the code executes normally but is transformed into a state machine by the compiler so as to provide suspension semantics). The using keyword provides a convenient way to guarantee the proper disposal of objects which require explicit cleanup. Resources which need to be disposed can be declared within a using statement and then used within the corresponding using block. When the program exits the scope of the using block in any way, the object will be disposed.
The other cornerstone for the presentation system is the notion of a view. A view is like a slide in a presentation: it is deployed to the screen at which point it may display text, images, or other graphical components. At some point an event causes the view to finish, at which time it and all its resource-intensive components are properly disposed of. This logic is implemented in an abstract class which can be extended to implement specific behavior, such as displaying images, prompting the user for input, or drawing a fixation cross.
Within this framework, an experiment can be implemented by a function which yields a sequence of views. This allows the logic of the experiment to be expressed sequentially, with resource-intensive elements scoped by using blocks. In reality, the code is executed within an animation loop which deploys each view to the GUI thread of execution where it is displayed (Figure 21). All of the complexity of multithreading is handled at the animation loop and abstract view level, keeping it absent from the main experiment code. In addition to supporting the experiment functionality, the presentation system is also used to implement Experimenter’s built-in tool for assigning subclasses to stimuli.
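To make the pattern concrete, the fragment below sketches roughly what a trial looks like when expressed as a lazily generated sequence of views. The View, TextView, and ImageView classes here are simplified stand-ins for Experimenter’s actual view hierarchy; the animation loop (not shown) is what advances the enumerator once each view’s display time elapses or the user responds.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;

// Simplified stand-ins for Experimenter's view classes.
abstract class View : IDisposable
{
    public int DisplayTimeMillis { get; set; }
    public abstract void Deploy();          // show the view's graphical components
    public virtual void Dispose() { }       // release images, controls, etc.
}

class TextView : View
{
    private readonly string text;
    public TextView(string text) { this.text = text; }
    public override void Deploy() { /* render the text */ }
}

class ImageView : View
{
    private readonly Image image;
    public ImageView(Image image) { this.image = image; }
    public override void Deploy() { /* render the image */ }
    public override void Dispose() { image.Dispose(); }  // the image is an "expensive" resource
}

static class ExperimentLogic
{
    // A trial is just a lazily generated sequence of views: the sequential logic
    // reads top to bottom, and the compiler-generated state machine suspends it
    // between views while the animation loop displays each one.
    public static IEnumerable<View> RunTrial(string cueText, string imagePath, int displayTime)
    {
        yield return new TextView(cueText) { DisplayTimeMillis = 1000 };
        yield return new TextView("+") { DisplayTimeMillis = 750 };   // fixation cross

        // The using block guarantees the image is released even if the
        // experiment is aborted while this view is on screen.
        using (var stimulus = new ImageView(Image.FromFile(imagePath)) { DisplayTimeMillis = displayTime })
        {
            yield return stimulus;
        }
    }
}
```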
Allowing for a great degree of configurability via a graphical interface is challenging because each configurable parameter requires adding code for the appropriate type of input component (e. g. a text box) as well as a descriptive label and tooltip. Furthermore, reasonable default values must be provided and user input must be parsed and validated to ensure that it is within the allowed range. This sort of coding accounted for a large percentage of the work in developing the initial version of the application, so a different solution was required to make implementing the current version feasible.
Experimenter’s solution to this problem uses custom C# attributes, which allow code to be decorated with runtime-accessible meta-data. Rather than writing code to express the GUI logic for each configurable parameter, parameters are simply expressed as properties which are then decorated with Parameter or Description attributes. These attributes can be used to specify display names, descriptions, default values, and valid value ranges for the parameters. A custom component called a ConfigurationPanel can then use this information to dynamically construct a graphical component that allows the user to configure the parameters. Not only does this system save considerable programming time, but also it organizes the parameter specifications separately from the GUI logic, making the code neater and easier to modify (Figure 22). Usage of these attributes to annotate classes throughout the code base also allows many common functions to be implemented generically (i.e. in a way that allows them to operate on many types of objects). For example, Experimenter contains and uses generic functions for copying, instantiating, error-checking, and logging objects based on their parameters.
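The following simplified sketch conveys the flavor of this approach; the attribute definition and property names shown are approximations rather than the exact declarations found in the Experimenter source code.

```csharp
using System;
using System.Reflection;

// A simplified stand-in for Experimenter's Parameter attribute: it attaches a
// display name, description, and default value to a configurable property.
[AttributeUsage(AttributeTargets.Property)]
class ParameterAttribute : Attribute
{
    public string DisplayName { get; private set; }
    public string Description { get; set; }
    public object DefaultValue { get; set; }
    public ParameterAttribute(string displayName) { DisplayName = displayName; }
}

// Configurable settings are just annotated properties; no GUI code is needed here.
class DisplaySettings
{
    [Parameter("Display Time", Description = "Stimulus display time in ms", DefaultValue = 500)]
    public int DisplayTimeMillis { get; set; }

    [Parameter("Superimposed", Description = "Overlay the two stimuli", DefaultValue = false)]
    public bool Superimposed { get; set; }
}

static class ConfigurationPanelDemo
{
    // A ConfigurationPanel-style component can discover the parameters at
    // runtime via reflection and build one labeled input control per property.
    static void Main()
    {
        foreach (var property in typeof(DisplaySettings).GetProperties())
        {
            var parameter = property.GetCustomAttribute<ParameterAttribute>();
            if (parameter == null) continue;
            Console.WriteLine("{0} ({1}): default = {2}",
                parameter.DisplayName, parameter.Description, parameter.DefaultValue);
        }
    }
}
```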
As previously mentioned, Experimenter does not support millisecond-accurate timing
of presentation displays. This is due to the use of the System.Windows.Forms.Timer class, whose accuracy is limited to 55ms.
While this level of inaccuracy may seem problematic, in most cases the accurate time stamping and tagging of the data is far more important than is the accurate timing of the display, since data collected at the end of a trial can always be ignored. To ensure accurate tagging relative to what is actually being displayed, Experimenter’s MCAEmotiv.GUI.ImageView class sets the Emotiv marker (used to tag the next data point collected) immediately after the display control has been deployed to the window, and sets the marker back to the default value immediately before removing this control. To maintain accurate time stamps for collected data, Experimenter simply uses the original time stamps generated by Emotiv.
While for the most part Experimenter is based on original code, the .NET Base Class Library,
and the Emotiv SDK, it also makes use of open-source third-party software where
possible. The Math.NET Iridium library provides APIs for many common numerical
procedures, such as the fast Fourier transform (FFT) and basic matrix operations.
Since the focus of this project was not to investigate the EEG response to a particular stimulus but instead to investigate the usage of the Emotiv headset for a general class of EEG experiments, I tried several different experiments over the course of the project. These experiments test both Emotiv and the capabilities of the Experimenter application.
Determining whether a subject’s eyes are open or closed is one of the few EEG
classification tasks that can easily be performed by examining the voltage time
series with the naked eye. When a person’s eyes are closed, the EEG signals at
the occipital electrodes (channels O1 and O2 on Emotiv) exhibit distinctive
high-amplitude waves (Figure 23).
The experiment used the following format. During each training trial, the subject was presented with an instruction, either “Open your eyes” or “Close your eyes.” The subject then followed this instruction until a tone was played, since a visual cue does not work in the closed condition. The subject then responded to a question asking whether he or she had just opened or closed his or her eyes. If the response was not consistent with the instructions, the trial was invalidated. During test trials, the subject was given a choice as to whether to open or close his or her eyes during each trial. The following setup steps implement this design using Experimenter (only changes from default options are listed):
1. Create two stimulus folders, each containing one arbitrary stimulus. In this case, images of an open eye and a closed eye were used. Load the folders into Experimenter to create classes.
2. Label the open eye class as “Open your eyes” and the closed eye class as “Close your eyes”.
3. Set the question for both classes to be “Were your eyes open or closed?” and the answers for both classes to be “My eyes were open” and “My eyes were closed.”
4. Classify the single stimulus in the closed class as “My eyes were closed” and the single stimulus in the open class as “My eyes were open.”
5. On the image display settings panel, select Beep After Display. Optionally set Width and Height to one to avoid actually displaying the placeholder stimuli.
6. In the experimenter settings panel, configure the experiment to use a relatively long Fixation Time and Display Time (e. g. 750ms and 1000ms respectively). This allows sufficient time to pass after the subject closes his or her eyes so that a steady state can be reached.
7. On the artifact detection panel, disable artifact detection.
A classic experimental paradigm is to try to distinguish between the neural
processing of faces and scenes. The difference is known to be detectable, although
traditionally such experiments are done with fMRI rather than EEG.
The initial version of the experiment supported only data collection as the subject viewed a series of face and scene images. The images were scaled to a size of 400 by 400 pixels and displayed in color for one second each, preceded by a randomly-timed fixation cross display and followed by a two-second break. There were several notable problems with this experiment. First, the experiment proved very tedious because of the lack of any requirement for user input. This made it very easy for the subject’s mind to wander. Second, the long display time gave the subject time to move his or her eyes around the image, which could have introduced motion-related noise. Worse, if the artifacts of the motion were systematically different for the two classes of images, this could have provided the classifier with an “invalid” means of classification. Finally, displaying the images in color may have added noise or even systematic bias to the data as a result of the subject’s response to particular colors.
The first mind-reading version of the faces vs. places experiment attempted to resolve these issues using various features of the Experimenter application. In this version, the subject was presented with superimposed gray scale images from two Experimenter stimulus classes: “Faces” and “Places”. During the training phase, subjects were directed to focus on either the face image or the place image before viewing the superimposed pair. During the test phase, the subject was allowed to choose mentally which image to focus on prior to seeing the display. After viewing an image, the subject had to determine whether the image depicted a man or a woman (for faces) or whether it showed an indoor scene or an outdoor scene (for places). Images were displayed for 500ms (shorter display times of 200-250ms were also attempted). Typically, 50 training trials per stimulus class were used along with 100 test trials. Both the test subject and I found this experiment frustrating because it was often difficult to avoid viewing both images (particularly since faces are very salient). Perhaps because of this, the classifiers I tried performed poorly under these conditions. The following setup steps implement this design using Experimenter (only changes from default options are listed):
1. Create two stimulus folders, one containing faces and one containing places. Load the folders into Experimenter to create classes.
2. Label the classes appropriately, and use the classification tool to assign male/female subclasses to the face images and indoor/outdoor subclasses to the place images.
3. On the image display settings panel, select Superimposed and tune the Alpha (transparency) parameter until both superimposed images are suitably visible.
The final iteration of the faces vs. places experiment motivated the development of the basic “side-by-side” mode in Experimenter. In this version, the face and place images in each trial are displayed on opposite sides of the screen. During the training phase, the subject is directed to look at one side of the screen, and during the test phase he or she is allowed to choose a side to look at. Following setup steps one and two for the previous experiment implements this design using Experimenter.
This experiment tested Emotiv and Experimenter with another classic paradigm: the artist vs. function task, in which the subject processes the same noun stimuli in two different ways depending on the condition (for example, considering how the object might be drawn versus considering its function). The following setup steps implement this design using Experimenter (only changes from default options are listed):
1. Generate a long list of concrete nouns, perhaps using the MRC Psycholinguistic Database, and save it as a text file.
2. Create two stimulus folders, each containing a copy of this text file. Load the folders into Experimenter to create the classes.
3. Label the classes “Artist” and “Function”, and specify their questions and answers appropriately. Leave all stimuli as unclassified.
4. On the image display settings panel, select Superimposed and set Alpha to zero or 255. This guarantees that only stimuli from one class will show (this is fine since the classes use the same stimuli).
5. On the experiment settings panel, set the Display Time to 3000ms and the Question Mode to Ask.
Like the artist vs. function experiment, this experiment attempts to distinguish between two very different types of processing. Subjects are presented with two superimposed stimuli: an image of a face and a simple mathematical expression. When directed to focus on the face, the subject must determine whether the person is male or female. When directed to focus on the expression, the subject must attempt to evaluate the expression in the given time. A display time of two or more seconds was used to give the subject sufficient time to at least make progress evaluating the expression. Since Experimenter currently supports only two possible subclasses per stimulus class, all expressions needed to evaluate to one of two values. To keep processing as uniform as possible, I tried to avoid answer values that might allow shortcuts for some expressions. For this reason, I chose 14 and 16, since these two values have the same parity and are close in magnitude. The following setup steps implement this design using Experimenter (only changes from default options are listed):
1. Create a text file with one mathematical expression on each line. This can easily be done with a short script (Figure 25); a sketch along these lines follows this list.
2. Place this text file in a stimulus class folder and load the folder into the application. Configure the class’s Answer values to match the possible expression values.
3. Generate and load a face stimulus class as previously described.
4. On the image display settings panel, select Superimposed and configure the Alpha value for the image so that the text is clearly visible but does not dominate the image (I found that a value of 200 was appropriate).
5. On the experiment settings panel, set the Display Time to 2000ms.
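The sketch below is comparable in spirit to the script referenced in Figure 25; the file name and exact expression forms here are illustrative. It writes one expression per line, each evaluating to either 14 or 16, to a text file that can be loaded as a stimulus class.

```csharp
using System;
using System.IO;
using System.Linq;

// Writes one arithmetic expression per line, each evaluating to 14 or 16,
// to a text file that can be loaded as an Experimenter stimulus class.
static class ExpressionGenerator
{
    static void Main()
    {
        var rng = new Random();
        int[] targets = { 14, 16 };
        var lines = Enumerable.Range(0, 100).Select(i =>
        {
            int target = targets[i % 2];          // alternate 14 and 16
            int b = rng.Next(2, 4);               // 2 or 3
            int c = rng.Next(2, 4);               // 2 or 3
            int a = target - b * c;               // so that a + b * c == target
            return string.Format("{0} + {1} * {2}", a, b, c);
        });
        File.WriteAllLines("expressions.txt", lines);
    }
}
```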
As emphasized by the success of Emotiv’s ExpressivTM suite, EEG signals
can be used to detect the movement of facial muscles. Indeed, the detection of
the so-called “artifacts” that result from these motions is critical in
processing EEG signals because the powerful motion-related noise swamps the
weaker neuronal signal (Figure 24). The most
notorious such artifact is the eye blink, which causes a large deflection of
the signal that is typically 20 dB above background activity.
Experimenter’s artifact detection algorithm is based on one used in an EEG study by Ehren Newman.
The alpha and threshold values used by Newman (0.025, 40, 0.5, and 40)
generally work quite well with the Emotiv; these are the default values used by Experimenter.
Observation of the online FFT output provided by the Emotiv TestBench application suggests that motion artifacts could also be detected by watching for power surges at low frequencies in the signal’s Fourier transform. Since the algorithm described above seems to be sufficiently effective, however, I did not explore this possibility further.
Data collected by the Emotiv headset, or indeed by any EEG setup, is in the form of a collection of a real-valued voltage time series (one for each scalp electrode). When considering how to analyze and classify such data, there are two distinct possible approaches. The first is a “steady-state” approach. With this approach, one attempts to develop a classifier which can categorize an arbitrary subsection of the time series. Since obviously the same signal may look quite different depending on where the sampling “window” is placed, some sort of preprocessing is required that can extract a set of features that are mostly independent of window placement. Such features can be said to represent steady-state properties of the EEG signal, and can thus be compared across samples. Possible options for such features include the mean voltage of the signal over the window, the average difference of the signal voltage from that mean, and the power spectrum of the signal.
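As a concrete illustration, the first two of these candidate features can be computed for a single channel as follows; this is a generic sketch, not Experimenter’s exact feature-extraction code. The average deviation from the mean is the “amplitude” feature used later in the eyes open vs. eyes closed analysis.

```csharp
using System;
using System.Linq;

static class SteadyStateFeatures
{
    // Mean voltage over the sampling window.
    public static double Mean(double[] window)
    {
        return window.Average();
    }

    // Average absolute difference from the window mean: a simple,
    // window-placement-insensitive proxy for the signal's amplitude.
    public static double AverageDeviation(double[] window)
    {
        double mean = window.Average();
        return window.Average(v => Math.Abs(v - mean));
    }
}
```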
The alternative approach is to keep time as a factor in the equation. In many
cases, one wishes to classify the EEG response to some stimulus. Thus rather
than building a classifier that operates on an arbitrary subsection of the time
series, one can instead design a classifier to work with samples of data which
have been aligned based on the time of the stimulus presentation. For example,
if one collects many 500ms samples of data, each of which is time-locked to the
presentation of some stimulus, then the voltage data at 200ms post stimulus onset can be treated as a
feature that can be compared across samples. The argument for using this
approach instead of the steady-state approach is that, in many cases, distinct
responses to certain stimuli have been observed to occur at predictable time
points post stimulus onset.
EEG data is extremely noisy, which makes classifying it correspondingly difficult.
Some sources of noise are external. These include the aforementioned motion
artifacts and noise introduced by the EEG collection device. One controllable
source of external noise is the alternating current in the electric mains.
Performing classification in an online manner is more difficult than offline
classification for several reasons. First, the classifier must train relatively
quickly. While this is not an issue for many learning algorithms, it may rule
out some procedures, such as training a classifier many times on different
feature sets or with different random seeds (in the case of an algorithm with a
random element), which are used in offline studies.
It is also important to consider how the performance of offline and online
classifiers is validated. A popular offline means of assessing classifier
performance is leave-one-out or k-fold validation.
The most straight-forward approach to EEG analysis is to use the raw signal voltages directly. Since the collected data is never perfectly aligned with the onset of the stimulus (both because the data resolution is finite and because the combination of timers and multithreading means that exact presentation time cannot be guaranteed), it is often helpful to down-sample the time series so that each data sample has the same sequence of time bins. This also helps reduce the effect of very high frequency noise. I typically used down-sampling to reduce the resolution to 20ms instead of the native 7.8ms; this does not noticeably alter the overall shape of the time series. Down-sampling also reduces the dimensionality of the data, which is very important for some classifiers and nearly always decreases training time.
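A minimal version of this down-sampling step might look like the following, with each output value taken as the mean of the raw samples falling in a fixed-width time bin (bin-boundary and interpolation details are simplified relative to the actual implementation). At the Emotiv’s native resolution of roughly 7.8ms, a 20ms bin typically contains two or three raw samples.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DownSampler
{
    // Reduces a single channel's (time, voltage) series to one value per
    // binWidthMillis, averaging all raw samples whose time stamps fall in the
    // same bin. Times are measured relative to stimulus onset.
    public static double[] DownSample(
        IList<double> timesMillis, IList<double> voltages, double binWidthMillis, double totalMillis)
    {
        int binCount = (int)Math.Ceiling(totalMillis / binWidthMillis);
        var sums = new double[binCount];
        var counts = new int[binCount];
        for (int i = 0; i < timesMillis.Count; i++)
        {
            int bin = Math.Min((int)(timesMillis[i] / binWidthMillis), binCount - 1);
            sums[bin] += voltages[i];
            counts[bin]++;
        }
        return sums.Zip(counts, (sum, count) => count > 0 ? sum / count : 0.0).ToArray();
    }
}
```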
A common preprocessing step is to subtract the sample mean from each channel time
series (effectively setting the mean to zero). The goal of this is to reduce
the impact of low-frequency drift in classification. Of course, this also risks
the possibility of removing useful information; perhaps some stimuli induce a
general increase or decrease in voltage along a channel. One solution to this
problem is to expand the sample to include some amount of time before and after
the onset of the stimulus for the purposes of calculating the mean.
A classic approach to signal processing is to transform the signal from the time
domain to the frequency domain. The frequency domain representation of a signal
can prove more tractable to analysis than the time domain representation both
because signals are often well-characterized by one or more frequencies (e. g.
musical notes) and because sources of noise whose frequencies are different
than the relevant frequencies in the signal (e. g. the electric mains) will be
effectively separated out by the transformation. In working with EEG, the goal
of frequency analysis is typically to determine the signal’s power spectrum,
that is, the power at each frequency. This can be computed via the (discrete)
Fourier transform, which breaks a signal into its constituent frequencies.
However, some signals, such as EEG, have frequency domain representations that
change over time. In these cases, an approach such as the short-time Fourier
transform or a wavelet transform can be used to capture both time and frequency information.
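For reference, the power spectrum of a short signal can be computed directly from the definition of the discrete Fourier transform, as in the naive sketch below; in practice the much faster FFT provided by Math.NET Iridium can be used instead.

```csharp
using System;

static class PowerSpectrum
{
    // Naive O(n^2) discrete Fourier transform returning the power at each
    // frequency bin k (k cycles over the window). For a window of n samples,
    // the frequency in Hz of bin k is k * samplingRate / n.
    public static double[] Compute(double[] signal)
    {
        int n = signal.Length;
        var power = new double[n / 2 + 1];
        for (int k = 0; k < power.Length; k++)
        {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++)
            {
                double angle = -2.0 * Math.PI * k * t / n;
                re += signal[t] * Math.Cos(angle);
                im += signal[t] * Math.Sin(angle);
            }
            power[k] = re * re + im * im;
        }
        return power;
    }
}
```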
Over the course of the project, I tried using both discrete Fourier transform- and wavelet-based preprocessing to improve the performance of my classifiers. In general, I found that use of the Fourier transform did not improve classification. Wavelets seemed to be a more promising approach, but here my progress was stymied by the lack of available wavelet libraries for .NET. The one implementation I did find was extremely limited in that it did not allow the frequencies of the wavelets to be specified and only processed a small number of frequencies. Applying this transform to the sample time series also failed to improve classifier performance (in fact, performance typically decreased).
Spence, Gerson, and Sajda suggest that, due to the Ohmic nature of tissue, EEG
activity can be accurately described with a linear forward model that
relates current sources within the brain to surface potentials. However, I
found that this linear discrimination method was insufficiently powerful to classify the data
collected with the Emotiv. This is likely due to Emotiv’s small number of electrodes,
since the channel count is equal to the number of features available to the
discriminator (the data used in Parra et al.’s study had 64 channels).
Because Experimenter supports online classification, it was vital that the application
have access to machine learning algorithms from within the .NET framework. One
disadvantage of working with the .NET platform is the lack of open-source
libraries providing access to implementations of machine learning algorithms. I
considered using the .NET NPatternRecognizer library, but ultimately decided
against it because of the clumsy APIs and messy, incoherent source code (which
is commented in Chinese and contains a great deal of dead/commented out code).
The k-nearest neighbors algorithm is a simple classifier which operates by finding
the k examples in the training set which are closest to the test example by
some metric (e. g. Euclidean distance).
A prediction is then computed as the most common class among this set of
“nearest neighbors”, with each neighbor’s vote assigned a weight inversely
proportional to that neighbor’s distance from the test example. The naïve
implementation of this algorithm has the advantage of being extremely quick to
train, since training consists only in saving a copy of the training set. I
implemented this algorithm primarily because it was used by Lillian Zhou in her
previous work with Emotiv.
· K: the number of nearest neighbors used in the voting process.
· WeightedVoting: a Boolean value which determines whether each vote is weighted by the inverse of the voter’s Euclidean distance to the test example (as opposed to being given equal weight).
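A compact sketch of the distance-weighted voting rule described above is given below, assuming binary labels of +1 and -1; the small constant guards against division by zero when a neighbor exactly matches the test example. This is illustrative rather than Experimenter’s actual implementation.

```csharp
using System;
using System.Linq;

static class KNearestNeighbors
{
    // Predicts +1 or -1 for testExample from the k nearest training examples,
    // each neighbor's vote weighted by the inverse of its Euclidean distance
    // (or weighted equally when weightedVoting is false).
    public static int Predict(double[][] trainExamples, int[] trainLabels,
                              double[] testExample, int k, bool weightedVoting)
    {
        var neighbors = trainExamples
            .Select((example, index) => new { Label = trainLabels[index], Distance = Euclidean(example, testExample) })
            .OrderBy(n => n.Distance)
            .Take(k);

        double vote = neighbors.Sum(n => n.Label * (weightedVoting ? 1.0 / (n.Distance + 1e-9) : 1.0));
        return vote >= 0 ? +1 : -1;
    }

    private static double Euclidean(double[] a, double[] b)
    {
        return Math.Sqrt(a.Zip(b, (x, y) => (x - y) * (x - y)).Sum());
    }
}
```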
Penalized logistic regression is a classification technique that seeks to fit the training set to
a logistic curve by projecting it onto one dimension and passing the result to
a logistic function.
· λ: the penalty term.
· MinDistance: the epsilon value used to determine when the weights have converged in the iterative algorithm.
· MaxIterations: an optional upper bound on the number of iterations performed.
A decision stump is an extremely simple classifier which uses only a single
feature of the data. In the training stage, the algorithm chooses a cut-point.
Classification of test examples is determined by testing whether the selected
feature in the example falls above or below the cut-point.
· Feature: the index of the feature to be used by the algorithm.
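Ignoring the weighted-example support needed when a stump serves as an AdaBoost weak learner, training reduces to scanning candidate cut-points on the chosen feature and keeping whichever yields the fewest training errors. A minimal sketch (binary labels of +1 and -1) follows.

```csharp
using System;
using System.Linq;

class DecisionStump
{
    public int Feature;        // index of the feature used for classification
    public double CutPoint;    // threshold chosen during training
    public int SignAbove = 1;  // label predicted when the feature exceeds the cut-point

    // Trains on binary (+1/-1) labels by trying each training value of the chosen
    // feature as a cut-point and keeping whichever yields the fewest errors.
    public void Train(double[][] examples, int[] labels)
    {
        int bestErrors = int.MaxValue;
        foreach (double candidate in examples.Select(e => e[Feature]).Distinct())
        {
            foreach (int sign in new[] { +1, -1 })
            {
                int errors = examples
                    .Select((e, i) => PredictWith(e[Feature], candidate, sign) != labels[i] ? 1 : 0)
                    .Sum();
                if (errors < bestErrors) { bestErrors = errors; CutPoint = candidate; SignAbove = sign; }
            }
        }
    }

    public int Predict(double[] example) { return PredictWith(example[Feature], CutPoint, SignAbove); }

    private static int PredictWith(double value, double cutPoint, int sign)
    {
        return value > cutPoint ? sign : -sign;
    }
}
```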
The voted perceptron algorithm is conceptually similar to more complex support
vector machine algorithms in that it works by (possibly) mapping the data into
a very high dimensional space and then locating the hyperplane which separates
the two classes and which maximizes the distance to the nearest training data points.
· Epochs: the number of iterations performed over the training set.
· Kernel: the kernel function used by the algorithm. Individual kernels may have their own configurable parameters.
AdaBoost is an ensemble learning algorithm: it derives its predictions by intelligently
combining the predictions of a group of “weak learners”, other classifiers
trained as part of the AdaBoost training process.
· Rounds: the number of weak learners trained by the algorithm.
· WeakLearnerTemplate: the type of classifier to be used as a weak learner. This classifier must be able to work with arbitrarily weighted training examples.
· WeakLearnerTrainingMode: this setting determines which subset of features is used in training each weak learner. For example, the AllFeatures setting trains each weak learner using every feature of the data, while the RandomFeature setting trains each weak learner on a single random feature of the data (this is useful when the weak learner is a decision stump).
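The core of the AdaBoost training loop can be sketched as follows, again assuming binary +1/-1 labels and a weak learner that accepts per-example weights. This is a textbook rendering rather than Experimenter’s actual implementation, which additionally handles the feature-subset training modes described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A weak learner must be trainable on weighted examples with binary (+1/-1) labels.
interface IWeightedWeakLearner
{
    void Train(double[][] examples, int[] labels, double[] weights);
    int Predict(double[] example);
}

class AdaBoost
{
    private readonly List<IWeightedWeakLearner> learners = new List<IWeightedWeakLearner>();
    private readonly List<double> alphas = new List<double>();

    // Trains `rounds` weak learners, reweighting the training set after each round
    // so that later learners concentrate on examples earlier learners misclassified.
    public void Train(double[][] examples, int[] labels, int rounds, Func<IWeightedWeakLearner> weakLearnerTemplate)
    {
        int n = examples.Length;
        var weights = Enumerable.Repeat(1.0 / n, n).ToArray();
        for (int round = 0; round < rounds; round++)
        {
            var learner = weakLearnerTemplate();
            learner.Train(examples, labels, weights);

            // Weighted training error of this round's learner.
            double error = Enumerable.Range(0, n)
                .Where(i => learner.Predict(examples[i]) != labels[i])
                .Sum(i => weights[i]);
            error = Math.Max(error, 1e-10);               // avoid division by zero
            double alpha = 0.5 * Math.Log((1 - error) / error);

            // Increase the weight of misclassified examples, decrease the rest, renormalize.
            for (int i = 0; i < n; i++)
                weights[i] *= Math.Exp(-alpha * labels[i] * learner.Predict(examples[i]));
            double total = weights.Sum();
            for (int i = 0; i < n; i++) weights[i] /= total;

            learners.Add(learner);
            alphas.Add(alpha);
        }
    }

    // The final prediction is a weighted vote over all weak learners.
    public int Predict(double[] example)
    {
        double score = learners.Zip(alphas, (learner, alpha) => alpha * learner.Predict(example)).Sum();
        return score >= 0 ? +1 : -1;
    }
}
```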
When classifying a time series using their linear discriminant analysis approach, Parra et al. simply average the output of the discriminator over all time points in the series in order to formulate a prediction for the trial as a whole.
This section details the numerical results obtained from various experiments. Aside from the eyes open vs. eyes closed experiment, where over 90% accuracy was achieved, most of my “successful” results fell within the 60-70% range. While such results may not seem impressive, it is important to consider them in the context of the noisy, 14-channel data stream from which they were obtained as well as the complexity of the underlying mental states that these experiments attempted to differentiate.
Since the expected EEG response to the eyes closed condition is a steady state response rather than a time-sensitive event-related potential (ERP), voltage-based classification that compares the voltages at particular time bins under each condition is generally ineffective despite the high separability of the two conditions. For this reason, the average amplitude of the signal (approximated by the average difference from the signal mean), rather than the voltage time series, was used as input to the classifier. Furthermore, only one occipital channel’s worth of data was used (either O1 or O2). A simple decision-stump classifier trained on this single-channel “amplitude” proved very effective (> 90% accuracy) at differentiating between the two conditions online even with very few (10 – 20) training trials. Most classifier failures could be attributed to eye-motion artifacts. For example, sometimes holding one’s eyes open causes them to feel very dry, leading to a fluttering of the eyelids or even a blink. Such artifacts can sufficiently distort the average amplitude to fool the simple classifier.
Two easy steps might be able to improve the classifier performance on this task to very near 100%. First, tuning the artifact detection algorithm to prevent its triggering during eyes closed trials would improve accuracy by allowing it to detect and eliminate most, if not all, erroneously classified trials (the above results were obtained with artifact detection disabled to avoid otherwise frequent false positives). Another option would be to classify based on the power spectrum of the occipital signals rather than the amplitudes, since this would preserve more discriminatory information and would likely be more robust to motion artifacts. These improvements were not pursued, however, since the current performance is indicative of Emotiv/Experimenter’s ability to perform very well on this task.
The offline accuracy results described in the subsequent sections were computed in the following manner (a code sketch of this evaluation procedure follows the list):
1. Reject all trials with motion artifacts using the default motion-artifact detection parameters. This was done because in some cases the experiments were run without artifact detection enabled.
2. For each trial, down-sample the time series in that trial to a resolution of 20ms.
3. For each channel in each trial, subtract the channel mean from each data point. Append this channel mean to the resultant series.
4. Concatenate the feature sets from each channel into a single feature set.
5. For each data set, partition examples into two collections, one for each class.
6. Randomly shuffle each collection of examples.
7. Reserve the first 10% of the examples in the first collection for the test set. Reserve the same number of examples from the other collection for the test set as well. Thus, chance performance of a classifier on the test set is always ½.
8. Concatenate and then shuffle the remaining examples to form the training set.
9. Train each classifier on the training set.
10. Record each classifier’s accuracy in classifying the test set.
11. Repeat steps 6-10 one hundred times, recording the average accuracy of each classifier.
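Steps 5 through 11 of this procedure correspond to the following sketch, in which a classifier is represented abstractly as a training function that returns a prediction function; the names and types here are illustrative rather than taken from the analysis code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class HoldOutEvaluation
{
    // Repeatedly (100 times) reserves a balanced 10% test set, trains on the
    // remainder, and returns the classifier's average test accuracy. Chance
    // performance is 0.5 because the test set always contains equally many
    // examples of each class.
    public static double AverageAccuracy(
        List<double[]> class1, List<double[]> class2,
        Func<List<Tuple<double[], int>>, Func<double[], int>> trainClassifier,
        Random rng)
    {
        double totalAccuracy = 0;
        const int repetitions = 100;
        for (int rep = 0; rep < repetitions; rep++)
        {
            var shuffled1 = class1.OrderBy(_ => rng.Next()).ToList();
            var shuffled2 = class2.OrderBy(_ => rng.Next()).ToList();
            int testSize = shuffled1.Count / 10;

            // Balanced test set: the same number of examples from each class.
            var test = shuffled1.Take(testSize).Select(e => Tuple.Create(e, +1))
                .Concat(shuffled2.Take(testSize).Select(e => Tuple.Create(e, -1))).ToList();
            // Remaining examples are concatenated and shuffled to form the training set.
            var train = shuffled1.Skip(testSize).Select(e => Tuple.Create(e, +1))
                .Concat(shuffled2.Skip(testSize).Select(e => Tuple.Create(e, -1)))
                .OrderBy(_ => rng.Next()).ToList();

            var predict = trainClassifier(train);
            totalAccuracy += test.Count(t => predict(t.Item1) == t.Item2) / (double)test.Count;
        }
        return totalAccuracy / repetitions;
    }
}
```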
Note that this process is similar to ten-fold validation except that it runs ten times as many tests and is randomized. This process was performed using the following classifiers:
· AdaBoost VP:
o Algorithm: AdaBoost
o Rounds: the number of features per channel
o WeakLearnerTrainingMode: each weak learner was trained on a random time bin’s worth of data where one “bin” actually consisted of the 14 channel means.
o WeakLearnerTemplate: Voted Perceptron
§ Epochs: 4
§ Kernel: Polynomial with degree 3
· AdaBoost DS:
o Algorithm: AdaBoost
o Rounds: the total number of features
o WeakLearnerTrainingMode: each weak learner was trained on a single random feature
o WeakLearnerTemplate: Decision Stump
· PLR:
o Algorithm: Penalized Logistic Regression
o λ: 10^-4
o MaxIterations: 50
o MinDistance: 10^-5
· KNN:
o Algorithm: K-Nearest Neighbors
o K: 20
o WeightedVoting: Yes
These classifier parameters were chosen based on my experience working with these classifiers over the course of the project. A thorough parameter scan could surely improve performance of at least some of the classifiers at the cost of a great deal of computation time.
The following chart (Figure 1) shows the performance of the four classifiers on three version 1 data sets. The first set is from an experiment where the two stimulus classes were faces and manmade objects, while the second and third sets used faces and outdoor scenes.
Figure 1 Version 1 results
Clearly, the faces vs. scenes data classified significantly better than the faces vs. objects data. This could be a fluke, or it could be indicative of the fact that faces vs. scenes is not equivalent to faces vs. non-faces (i. e. the particular choice of places matters as well). For the faces vs. scenes data sets, the PLR classifier had the best average performance.
Figure 2 shows the poor classifier performance on data generated in the superimposed version of the faces vs. places experiment. This is consistent with the observed online results which led to the creation of the side-by-side version of the experiment.
Figure 2 Version 2 results
Figure 3 shows the performance data for a number of data sets collected using version 3 of the experiment with myself and one other subject.
Figure 3 Version 3 offline results
Many of these data sets classify very well, with the PLR classifier breaking the 80% mark on one set and the 70% mark on several others. The data set that achieved the best performance was one where I experimented with using a smaller image size than normal (200x200 instead of 400x400). This makes it even easier for the subject to avoid moving his or her eyes. The observed online performance results for this experiment were slightly worse than the offline results (60-66%), although typically the online classifiers used in these experiments were not as well-tuned as the classifiers used in the offline analysis.
To better understand online performance, I then simulated online classification with periodic retraining every ten trials for each data set. Thus, the classifiers were trained on the first ten trials and judged on their ability to predict trials 11 through 20, then trained on the first 20 trials and judged on their predictions for trials 21 through 30, and so on. The accuracy score at each trial presented here is a running score taking all predictions so far into account (this is the same score displayed by Experimenter).
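Concretely, the simulation amounts to the following loop (a sketch with hypothetical names; the classifier is retrained on all trials seen so far after each block of ten):

var accuracyCurve = new List<double>();
int correct = 0, predicted = 0;
for (int start = 10; start + 10 <= trials.Count; start += 10)
{
    classifier.Train(trials.Take(start).ToList());      // train on trials 1..start
    foreach (var trial in trials.Skip(start).Take(10))  // predict the next ten trials
    {
        if (classifier.Predict(trial) == trial.Class)
            correct++;
        predicted++;
        accuracyCurve.Add(correct / (double)predicted); // running score, as in Experimenter
    }
}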
Figure 4 Version 3 PLR online accuracy
The result of this simulation was not surprising. Accuracy steadily improved before eventually leveling off at a value comparable to that found in the offline performance tests (Figure 4, see Figure 26, Figure 27, and Figure 28 for the results for the other classifiers). The one exception to this trend was the KNN classifier, which maintained roughly constant accuracy throughout on most data sets. For the other classifiers, though, these results unfortunately suggest that nearly 100 training trials are necessary before online classifier performance reaches its peak. While such experiments could be run, in many cases it seems worthwhile to sacrifice some online performance to reduce the number of training trials, creating a more enjoyable experience for the subject (since the mind reading aspect of the experiment only comes online in the test phase).
Despite a preliminary online accuracy result of over 60% using a classifier very similar to AdaBoost VP, in general the data collected during the artist vs. function experiment proved rather resistant to classification. Figure 5 shows three attempts to classify artist vs. function data sets from two subjects using different feature-selection techniques:
Figure 5 Artist vs. function results
The first pair of results used the standard feature selection routine discussed above, but explicitly took only the first 500ms of each trial (this truncation was done to keep classifier training times reasonable, especially for PLR). All classifiers performed poorly under these conditions. The second pair of results used only the second 500ms of each trial. This was motivated by the fact that both tasks require a willful action to initiate, which likely causes the response to be slower than for something more automatic like image processing. Whether or not this is the case, the data from this later time period proved more classifiable, resulting in moderately good performance (especially on the first data set). Finally, the third pair of results was obtained by training the classifiers on the power spectrum (as computed by the FFT) of each trial. This was attempted because the subject’s performing the artist or function task seemed more like a steady-state phenomenon than a time-sensitive response to a stimulus. However, the poor classifier performance under these conditions suggests either that this is not true or that more sophisticated time-frequency analysis methods are required.
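For reference, the power spectrum of one channel's data might be computed with Math.NET as follows (a sketch under the assumption that channelData holds one channel's samples as a double array; this is not necessarily the exact routine used):

using System.Numerics;
using MathNet.Numerics.IntegralTransforms;

Complex[] samples = channelData.Select(v => new Complex(v, 0)).ToArray();
Fourier.Forward(samples);                          // in-place FFT
double[] power = samples.Take(samples.Length / 2)  // keep the positive frequencies
    .Select(s => s.Magnitude * s.Magnitude)        // squared magnitude = power
    .ToArray();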
As with the artist vs. function experiment, I was able to obtain some relatively good online results (60-65% accuracy) for this experiment, but found classifying data collected for offline analysis to be very difficult. Figure 6 shows three attempts to classify faces vs. expressions data sets from myself and another subject using the previously described trio of feature selection techniques (except that rather than using the 500-1000ms time period for the second pair of results I used the 200-700ms time period):
Figure 6 Faces vs. expressions results
In general, classification was mediocre at best under all conditions. Furthermore, unlike with the artist vs. function data, the two data sets seem to behave differently from each other under the various feature-selection regimes.
The results described above demonstrate that the data collected with Experimenter can often be effectively classified, particularly in the eyes open vs. eyes closed and the faces vs. places cases. The data from the artist vs. function and faces vs. expressions experiments proved more difficult to classify, although this may be partially attributable to the fact that, due to time constraints, I only ran a few instances of each experiment, granting little opportunity either to tune the experiment or to train the subjects (unlike the faces vs. places experiment, these experiments require a degree of mental discipline).
Of the machine learning algorithms considered, PLR proved to be the most reliably effective at classifying Experimenter data (at least given the current set of classifier parameters). However, PLR was also by far the slowest classifier to train, taking five to ten seconds with 200 examples and 500ms worth of data (25 features/channel after down-sampling); the other algorithms generally took less than one second under the same conditions. This is especially troubling when using PLR for online classification. Interestingly, PLR's training time is generally dominated not by the size of the training set but by its dimensionality (at least with fewer than 1000 examples). The algorithm I implemented must invert an n × n matrix on each iteration, where n is one greater than the example dimension. This can be a problem, since Emotiv's EEG data has hundreds of features even for a short trial. For this reason, most of the algorithm's training time is spent inside calls to Math.NET's matrix inversion routine. As the number of features increases (perhaps from using longer trials), the PLR algorithm quickly becomes effectively unusable. Thus, to use PLR one must be more careful than with the other algorithms to reduce the size of the feature set. This can be done through trial and error, through leveraging experiment-specific neurological knowledge (e.g. the time of a known ERP), or through increased down-sampling.
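To make the cost concrete, the core update of a Newton-style PLR training iteration might look as follows with Math.NET (a simplified sketch, not the exact thesis implementation; here X is the m × n design matrix with a bias column, y the label vector, w the weight vector, p the vector of current predicted probabilities, W = diag(p(1 − p)), and lambda the penalty):

// using MathNet.Numerics.LinearAlgebra;
Matrix<double> hessian = X.TransposeThisAndMultiply(W * X)
    + lambda * Matrix<double>.Build.DenseIdentity(n);
Vector<double> gradient = X.TransposeThisAndMultiply(y - p) - lambda * w;
w += hessian.Inverse() * gradient; // the n x n inversion performed on every iteration

Since the Hessian is n × n, the per-iteration cost grows roughly cubically with the number of features, which is why dimensionality rather than training-set size dominates the training time.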
The development of Experimenter necessitated substantial use of the Emotiv headset. This experience allows me to speak to the product’s usefulness as a BCI and EEG research device. While Emotiv’s low cost, portability, and relative ease of use are obviously exciting in that they simplify the process of executing experiments, perhaps what will prove even more useful to researchers is the device’s potential to greatly expedite the experimental design and development process. With Emotiv, experiment designers can easily test their designs on themselves and collaborators long before running an expensive formal experiment. It would also be interesting to see researchers collaborate with Emotiv and Emotiv application developers to collect data on a larger scale. For example, Emotiv applications could give users the option of logging their EEG activity and periodically uploading it to a central database where it could be anonymized and made accessible to researchers (many commercial applications already upload usage data in this way for product improvement purposes). Such data would obviously be of lower quality than that collected under controlled conditions in a laboratory, but presumably its sheer volume and low cost would make it of great interest nonetheless. Of course, implementing such a scheme would require Emotiv to at least partially relax its restrictions on making raw data available to applications (since this is currently available only through the research edition). Presumably though, this feature could be implemented so as to maintain Emotiv’s hold on the real-time raw data stream and/or so as to provide Emotiv with a financial incentive (researchers might pay for access to the data). At the very least, one would assume that Emotiv might wish to use the raw data collected in this way to improve its own classification suites.
Despite the many exciting possibilities, though, there are some distinct pitfalls one can encounter when using the Emotiv headset. One issue is that the incoming data seems to become increasingly noisy as the battery nears empty. This is particularly frustrating since the battery meter in the Emotiv TestBench application does not seem to work properly, so the problem can easily go unnoticed. For researchers hoping to collect valid data using Emotiv, this means that the battery must be recharged frequently to avoid corruption. Also, although setting up Emotiv is obviously far easier than setting up a traditional EEG rig, properly aligning the device so as to get good connectivity on all electrodes can be difficult and frustrating, especially for new users. Practice with the headset also seems to be important for obtaining the best experimental results; in general, I observed that I and the one other subject with whom I worked most tended to produce more classifiable data than did other subjects. This is likely caused by acclimation to the experimental setup: a user's learning to maintain both focus and a consistent cognitive approach to each experimental task (e.g. evaluating a mathematical expression) likely reduces the noise present in the collected data.
There are a multitude of possible experiments which one could attempt using Emotiv and Experimenter. For example, one might try to distinguish between a subject's responses to other classes of stimuli, such as emotional and unemotional images. Another classic paradigm compares the subject's response to familiar versus "oddball" images; this could be implemented by creating one stimulus class with a small number of images and another with a large number of images. A variety of two-task experiments similar to the artist versus function experiment are also possible. Finally, the headset's freedom of motion creates the possibility of developing experiments whose conditions do not depend solely on looking at the screen, but instead attempt to detect the user's engagement in some other sort of activity.
The Experimenter application could be improved by adding a greater degree of configurability to the experiment setup and classification suites. For example, the experiment could be made more general by supporting a wider variety of tasks (currently only binary choice tasks are supported). The ability to input arbitrary integers, for example, would allow Experimenter to use a variety of counting and calculation tasks. Experiment designers may also wish to be able to directly manipulate probabilities used in the experiment. In particular, the oddball experiment would work best if the oddball category of images could be made to appear less frequently.
Increasing the number of available preprocessing options would be a good first step towards improving Experimenter’s online classification capabilities. For example, it would be beneficial if users could select from a variety of additional data features including channel amplitudes and time-frequency representations when constructing the input to a classifier.
Integrating IronPython with Experimenter would do much to improve the application's data-processing capabilities. IronPython is an open-source implementation of the popular Python programming language that is fully integrated with the .NET framework.
There are also many interesting possibilities when one considers building other sorts of applications on top of Experimenter's code base for interacting with the Emotiv headset and with Emotiv data. These capabilities, along with Experimenter's flexible presentation system, provide programmers with useful tools for rapidly developing Emotiv-based applications quite different in nature from Experimenter. To demonstrate this concept, I constructed two simple applications which use Experimenter's core code in slightly different ways. Using Experimenter's presentation, classification, and Emotiv interoperation suites, I was able to build each of these applications in about half an hour with roughly 100 lines of C# code.
A great deal of work has gone into developing algorithms for classifying images (e.g. face detection), something at which the human brain naturally excels. Thus, I thought it would be interesting to build an application whose purpose is to categorize images based on a viewer's EEG response. The application begins with a stimulus class containing two categories of images, such as faces and scenes. Only a few images of each type have been categorized; the rest are unmarked. The application then displays the categorized images to the user in rapid succession (400ms each) until each image has been displayed ten times. The collected responses to each image are then averaged, and a classifier is trained on the resulting trial-averaged signals. The application then repeats this process with the unmarked images. Whenever ten artifact-free responses have been collected for a single image, the application attempts to classify that image based on its trial-averaged response. It then retrains the classifier, including the newly classified image in the training set (the user either confirms or denies the correctness of the classifier's prediction). This continues until all images are classified. When I tested the image classifier using an AdaBoost/voted perceptron classifier, the application achieved 78% accuracy while classifying 42 face and scene images, beginning with only 5 marked images of each type.
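The core loop of this process can be sketched as follows (all helper names here are hypothetical rather than Experimenter's actual API):

while (unmarkedImages.Count > 0)
{
    var image = NextImage(unmarkedImages);           // rapid 400ms presentations
    var trial = RecordResponse(image);
    if (HasArtifact(trial))
        continue;                                    // discard contaminated responses
    responses[image].Add(trial);
    if (responses[image].Count == 10)                // enough artifact-free responses
    {
        var avg = AverageTrials(responses[image]);   // trial-averaged signal
        var prediction = classifier.Predict(avg);
        var label = UserConfirms(prediction) ? prediction : Opposite(prediction);
        trainingSet.Add(ToExample(label, avg));      // add the newly labeled image
        classifier.Train(trainingSet);               // retrain before continuing
        unmarkedImages.Remove(image);
    }
}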
Unlike Experimenter and the Image Classifier, which rely on time-locked data to perform classification, the Thought-meter explores the possibility of steady-state classification. The application first attempts to learn to distinguish between a "neutral" condition and some user-chosen condition (e.g. thinking about a specific topic) by recording 20 seconds of data under each condition before training a classifier. The application then displays a counter which behaves in the following manner: every time new data is received from the Emotiv, the last 500ms of data are fed to the classifier. When the user-defined condition is detected, the counter increases by one; when the neutral condition is detected, the counter drops by 1.5, to a minimum of zero. The user can then attempt to move the counter away from zero (which is difficult due to the bias towards neutral) using his or her mind. In testing, the Thought-meter succeeded in differentiating between watching a movie and staring at a blank screen, as well as between steady-state smiling (steady-state because the initial action of smiling triggers the artifact detection algorithm) and holding a neutral expression. However, I had little success in getting the current implementation to differentiate between purely cognitive conditions.
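The counter's update rule is simple. A sketch (again with hypothetical names) of the handler invoked each time new data arrives from the headset:

void OnNewData()
{
    var example = ToExample(LastMillisecondsOfData(500)); // most recent 500ms of data
    if (classifier.Predict(example) == USER_CONDITION)
        counter += 1.0;                                   // user-chosen condition detected
    else
        counter = Math.Max(0.0, counter - 1.5);           // neutral: decay toward zero
    UpdateDisplay(counter);
}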
This work resulted in a functioning, full-featured, and well-documented application which is capable of implementing a variety of interesting experiments using the Emotiv headset. The application supports both data collection for offline analysis as well as exciting online classification capabilities. These abilities were tuned through an exploration of various preprocessing and machine-learning algorithms, the results of which are presented here.
Another benefit of this work to future students and researchers is its development of a robust, high-level API for interacting with the Emotiv headset. This API lowers the barrier to building interesting applications that work with raw EEG data, rather than only with Emotiv's built-in classification suites. By using an open classification and data processing framework rather than limiting themselves to Emotiv's classification APIs, developers could benefit from the substantial knowledge and experience of the neuroscience community while also creating applications that could be used to further our understanding of the brain.
I would like to specially thank several people who contributed to the success of this project:
· Lillian Zhou, for providing me with all of her data, code, and write-ups from her previous independent work with Emotiv.
· Jarrod Lewis-Peacock, for providing me with a huge number of stimulus images for my experiments.
· Annamalai Natarajan, for helping me learn to use Emotiv properly, for sharing information on related research, and for making a valiant effort to get the running room calendar to allow me to reserve time for my experiments.
· Nick Turk-Browne, for taking time to meet with me and share his expertise on stimulus presentation techniques.
· Matt Costa, for volunteering as a subject and giving usability feedback.
· Heather Stiles, for patiently test-driving numerous buggy iterations of the Experimenter application and for providing data and feedback.
· Ken Norman, for providing enthusiasm, assistance, time, and advice.
Math.NET Numerics. (2011). Retrieved 2011, from Math.NET: http://numerics.mathdotnet.com/
Chang, C.-C., & Lin, C.-J. (2011). LIBSVM - A Library for Support Vector Machines. Retrieved 2011, from http://www.csie.ntu.edu.tw/~cjlin/libsvm/
Coles, M. G., & Rugg, M. D. (1996). Electrophysiology of Mind. Oxford Scholarship Online Monographs.
Dakan, R. (2010). Review: Emotiv EPOC, tough thoughts on the new mind-reading controller. Retrieved 2011, from Joystiq.
Emotiv. (2009). Extracting RAW data and other applications. Retrieved 2010, from Emotiv Community: http://www.emotiv.com/forum/forum4/topic401/
Emotiv. (2010). Comparison with other EEG units. Retrieved 2011, from Emotiv Community: http://www.emotiv.com/forum/forum4/topic127/
Emotiv. (2010). Developer Edition SDK. Retrieved 2011, from Emotiv Store: http://www.emotiv.com/store/sdk/bci/developer-edition-sdk/
Emotiv. (2010). Electrode Placement. Retrieved 2011, from Emotiv Community: http://www.emotiv.com/forum/forum14/topic255/
Emotiv. (2010). mac support? Retrieved 2011, from Emotiv Community: http://www.emotiv.com/forum/forum4/topic73/
Emotiv. (2010). User guide by users! Retrieved 2011, from Emotiv Community: http://www.emotiv.com/forum/forum4/topic126/
Freeman, E., Freeman, E., Bates, B., & Sierra, K. (2004). Head First Design Patterns. O'Reilly Media.
Freund, Y., & Schapire, R. E. (1999). Large Margin Classification Using the Perceptron Algorithm. Machine Learning, 277-296.
Frijters, J. (2010). IKVM.NET. Retrieved 2011, from http://www.ikvm.net/
Hanke, M., Halchenko, Y. O., Sederberg, P. B., Hanson, S. J., Haxby, J. V., & Pollmann, S. (2009). PyMVPA: A Python toolbox for multivariate pattern analysis of fMRI data. Neuroinformatics, 37-53.
Jackson, M. (2010). Painting Your Own Tabs - Second Edition. Retrieved 2011, from The Code Project: http://www.codeproject.com/KB/tabs/NewCustomTabControl.aspx
Johnson, M. K., Kounios, J., & Nolde, S. F. (1997). Electrophysiological brain activity and memory source monitoring. NeuroReport, 1317-1320.
Liu, F. (2009). NPatternRecognizer. Retrieved 2011, from CodePlex: http://npatternrecognizer.codeplex.com/releases/view/37602
McDuff, S. G., Frankel, H. C., & Norman, K. A. (2009). Multivoxel Pattern Analysis Reveals Increased Memory Targeting and Reduced Use of Retrieved Details during Single-Agenda Source Monitoring. The Journal of Neuroscience, 508-516.
Microsoft. (2011). IDisposable Interface. Retrieved 2011, from MSDN: http://msdn.microsoft.com/en-us/library/system.idisposable.aspx
Microsoft. (2011). IronPython. Retrieved 2011, from http://www.ironpython.net/
Microsoft. (2011). Timer Class. Retrieved 2011, from MSDN: http://msdn.microsoft.com/en-us/library/system.windows.forms.timer.aspx
Newman, E. (2008). Testing a model of competition-dependent weakening through pattern classification of EEG. Princeton University.
Newman, E. L., & Norman, K. A. (2010). Moderate Excitation Leads to Weakening of Perceptual Representations. Cerebral Cortex Advance Access, 1-11.
Novell. (2011). Cross platform, open source .NET development framework. Retrieved 2011, from Mono: http://www.mono-project.com/Main_Page
Parra, L. C., Spence, C. D., Gerson, A. D., & Sajda, P. D. (2005). Recipes for the linear analysis of EEG. NeuroImage, 326-341.
Root, M. (2011). Disability and Gaming Part 3: Hardware And Where It’s Going. Retrieved 2011, from PikiGeek.
Russell, S., & Norvig, P. (2003). Artificial Intelligence: A Modern Approach. Upper Saddle River: Pearson Education Inc.
Schapire, R. E. (2003). The Boosting Approach to Machine Learning. Nonlinear Estimation and Classification, 1-23.
Solon, O. (2011). Mind-controlled flying rig lets you be a superhero. Retrieved 2011, from Wired: http://www.wired.co.uk/news/archive/2011-03/03/infinity-simulator
Squatriglia, C. (2011). Thinking Your Way Through Traffic in a Brain-Control Car. Retrieved 2011, from Wired: http://www.wired.com/autopia/2011/03/braindriver-thought-control-car/
University of Waikato. (2011). WEKA. Retrieved 2011, from http://www.cs.waikato.ac.nz/ml/weka/
Wilson, M. (1988). MRC Psycholinguistic Database: Machine Readable Dictionary. Behavior Research Methods, Instruments, & Computers, 6-11.
Zhou, L. (2010). The Emotiv EPOC Headset: the feasibility of a cheaper, more accessible neural imager. Princeton: Princeton University Independent Work.
Figure 7 Training phase instruction display
Figure 8 Side-by-side fixation cross display
Figure 9 Side-by-side stimulus display
Figure 10 Choice display
Figure 11 Test instruction display
Figure 12 Test phase prediction and choice display
Figure 13 Experiment settings panel
Figure 14 Classifiers and artifact detection panel
Figure 15 Individual classifier configuration tab
Figure 16 Classes panel
Figure 17 Individual class configuration tab
Figure 18 Image classification tool
IEnumerable<EEGDataEntry[]> trials = /* Get trial data from somewhere */;
int minCount = trials.Min(t => t.Length); // length of the shortest trial
var examples = trials.Select(t =>
{
    // mean voltage on each of the 14 channels for this trial
    var means = Channels.Values.Select(ch => t.Select(e => e[ch]).Average()).ToArray();
    // trim, flatten, mean-subtract, and append the channel means
    return new Example(t.Marker, t.Take(minCount)
        .SelectMany(e => e.Select((d, i) => d - means[i]))
        .Concat(means));
}).ToArray();
Figure 19 Data processing in C#
This code example demonstrates many of C#'s powerful data-processing features. Its purpose is to convert a sequence (IEnumerable) of trials (each represented by an array of EEGDataEntries) into an array of Examples which can be fed to a classifier. This requires flattening the multi-dimensional data in each trial (some number of entries × 14 channels' worth of data per entry) into a single sequence of voltage values. Along the way, we also want to subtract out the mean voltage value along each channel and then append these mean values to the end of the feature sequence. Furthermore, since by chance some trials have a few more entries than others, we need to ensure that the resultant feature sets are trimmed so that the features in each example line up regardless of the length of the original trial. In many popular programming languages, this would require many lines of code and multiple loops; in C#, we need neither.
This code may seem arcane at first, but it is easy to break down. The code is peppered with inline anonymous functions (also called lambda functions). These functions are designated with the => operator, which can be read as "goes to". Thus the anonymous function t => t.Length takes one argument (t) and returns the Length property of t. In this case, t is an array of EEGDataEntries; although C# is statically typed, there is no need to declare the type of t since the compiler can infer it from context. Similarly, note that the examples and means variables are declared with type var. This is a shortcut for declaring the types of these variables, which in this case are Example[] and double[] respectively. The compiler permits the use of var since those types can be inferred from context.
The next step in unpacking this code is to note the variety of functions in C# which operate on (and sometimes return) sequences (IEnumerables). These functions often take another function as an argument. For example, the Select function for a sequence of items of type T takes a function selector whose argument is of type T. Select then iterates over the sequence, evaluating selector for each item and returning a new sequence of the results.
Consider the code for determining the mean value along each EEG channel:
var means = Channels.Values.Select(ch => t.Select(e => e[ch]).Average()).ToArray();
Channels.Values is an array of all channel values. We call Select on this array, passing in the anonymous function:
ch => t.Select(e => e[ch]).Average()
as the selector function. Channels.Values is a sequence of items of type Channel. Thus, the selector function takes a Channel (ch) as an argument. The selector function then calls Select again, this time on t, which is an array of EEGDataEntries. The selector function for this call to Select:
e => e[ch]
simply extracts the voltage recorded on channel ch from each entry. Calling Average on the resulting sequence yields the mean value for that channel, and ToArray collects the per-channel means into the means array.
Figure 20 The publish/subscribe interface to the Emotiv data stream
Figure 21 The yield-view presentation system
Figure 22 The parameter-description system
Figure 23 Eyes closed response on the O1 and O2 channels
Figure 24 Eye blink response on the AF3 and AF4 channels
// Generates arithmetic expressions of the form "a ± b ± c" that evaluate to one
// of two target values (14 or 16); the retry logic is reconstructed from fragments.
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class ExpressionGenerator {
    public static void main(String[] args) {
        int num1 = 14, num2 = 16, numProbs = 1000;
        Random r = new Random();
        Set<String> set = new HashSet<String>();
        for (int i = 0; i < numProbs; i++) {
            int target = (i % 2) == 0 ? num1 : num2; // alternate the target value
            int a, b, c, sign;
            String prob;
            do {
                c = 0; prob = null; // sentinel: an early bail-out forces a retry
                a = r.nextInt(num2) + 1;
                if (a == num1 || a == num2)
                    continue; // no operand may itself equal a target value
                b = r.nextInt(num2) + 1;
                sign = r.nextBoolean() ? 1 : -1;
                if (sign > 0 && (b == num1 || b == num2))
                    continue;
                c = target - a - sign * b; // force the expression to hit the target
                prob = String.format("%d %c %d %c %d", a, (sign > 0 ? '+' : '-'),
                        b, (c > 0 ? '+' : '-'), Math.abs(c));
            } while (c == 0
                    || c == num1
                    || c == num2);
            set.add(prob); // duplicate expressions collapse in the set
        }
        for (String prob : set)
            System.out.println(prob);
    }
}
Figure 25 Expression generation code
Figure 26 Version 3 AdaBoost VP online accuracy
Figure 27 Version 3 AdaBoost DS online accuracy
Figure 28 Version 3 KNN online accuracy
This section presents a walkthrough of using Experimenter to configure and run a faces vs. scenes experiment.
1. Experiment Configuration
The experiment settings region allows for the configuration of a number of experimental parameters, including timing information and the number of stimuli to be displayed during each phase of the experiment. Also important is the ability to specify the types of output to be produced. By default, no output is saved, since extraneous output is irritating when trying to test out an experiment. For real experiments, though, the Log Experiment option is highly recommended.

There are also two data output formats. Save Raw Data logs all data collected during the experiment; however, this means that test-phase data is marked with the unknown marker (-2) rather than with the correct class marker, because the correct class of the test data is not known by the application until the user declares it following prediction. In most cases, the Save Trial Data option is more useful. When this option is turned on, only correctly marked data collected during valid trials is logged. Both save modes can be run in parallel, generating two output files. Since each data point contains timing information, it is easy to match up saved trials with the raw data time series, which can be useful for some kinds of offline analysis.

All output is saved to the folder indicated under Data Output Folder; clicking on this link allows the folder to be changed. Finally, the Save and Load buttons allow experiment settings to be saved between uses of the application. Note that, depending on the size of the application on the screen, it may be necessary to scroll to see all of the configuration options available.
2. Load Stimuli