Learning deep learning (project 3, generate TV script)

April 4th, 2017

In this class project, I generated my own Simpsons TV scripts using an RNN trained on a dataset of Simpsons scripts from 27 seasons. The neural network generated a new TV script for a scene at Moe's Tavern.

This is the script generated by the network:

moe_szyslak: ya know, i think i'll volunteer, too.
barney_gumble: to homer! it's me! i'm the prime minister of ireland!
moe_szyslak: hey, homer, show ya, are you and, what's wrong which youse?
moe_szyslak: the point is, this drink is the ultimate?
man: yes, moe.
moe_szyslak: ah, that's okay. it's like my dad always said if you would never been so great.
homer_simpson: yeah, they're on top of the alcohol!
homer_simpson: wayne, maybe i can't.
moe_szyslak: ah, that's okay. it's like my dad always said that when i drink.
homer_simpson: you can't be right now what-- like, you should only drink to get back a favor.
homer_simpson: moe, why you bein' so generous and your name!(looks around) oh you, are you sure?
bart_simpson: square as" golden books," pop i had good writers. william faulkner could write an exhaust pipe gag that.
moe_szyslak:" sheriff andy" can't someone else do it

Does it make sense? :)
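For context, the network produces a script like this one word at a time: at each step it outputs a probability distribution over the vocabulary, and the next word is sampled from that distribution. Here is a minimal sketch of that sampling loop, with a made-up vocabulary and a random stand-in for the trained RNN (nothing here is from the actual project code):

```python
import numpy as np

rng = np.random.default_rng(42)
vocab = ["moe_szyslak:", "homer_simpson:", "hey", "beer", "\n"]

def fake_rnn_step(word_id):
    # Stand-in for the trained RNN: a real model would condition on its
    # hidden state and the previous word; here we just return random logits
    # turned into a probability distribution over the vocabulary.
    logits = rng.standard_normal(len(vocab))
    return np.exp(logits) / np.exp(logits).sum()

word_id, words = 0, [vocab[0]]
for _ in range(10):
    probs = fake_rnn_step(word_id)
    word_id = rng.choice(len(vocab), p=probs)  # sample the next word
    words.append(vocab[word_id])

print(" ".join(words))
```

The real project trains an LSTM on the script corpus so the distribution at each step actually depends on the preceding words, which is what makes the output read (almost) like dialogue.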

The full project with code can be found here:
dlnd_tv_script_generation_submit2.html

Author: Xu Cui Categories: deep learning Tags:

GPU is 40-80x faster than CPU in TensorFlow for deep learning

April 4th, 2017

The speed difference between CPU and GPU can be significant in deep learning. But how much? Let's do a test.

The computer:

The computer I use is an Amazon AWS g2.2xlarge instance (https://aws.amazon.com/ec2/instance-types/). The cost is $0.65/hour, or $15.6/day, or $468/month. It has one GPU (a high-performance NVIDIA GPU with 1,536 CUDA cores and 4 GB of video memory), 8 vCPUs (high-frequency Intel Xeon E5-2670 (Sandy Bridge) processors), and 15 GB of memory.
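The daily and monthly figures follow directly from the hourly rate (assuming a 30-day month):

```python
hourly = 0.65
per_day = hourly * 24
per_month = per_day * 30
print(round(per_day, 2), round(per_month, 2))  # 15.6 468.0
```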

The script:

I borrowed Erik Hallstrom’s code from https://medium.com/@erikhallstrm/hello-world-tensorflow-649b15aed18c

The code runs matrix multiplications of increasing size and measures the time taken on the CPU vs. the GPU.

from __future__ import print_function
import matplotlib
import matplotlib.pyplot as plt
import tensorflow as tf
import time

def get_times(maximum_time):

    device_times = {
        "/gpu:0":[],
        "/cpu:0":[]
    }
    matrix_sizes = range(500,50000,50)

    for size in matrix_sizes:
        for device_name in device_times.keys():

            print("####### Calculating on the " + device_name + " #######")

            shape = (size,size)
            data_type = tf.float16
            with tf.device(device_name):
                r1 = tf.random_uniform(shape=shape, minval=0, maxval=1, dtype=data_type)
                r2 = tf.random_uniform(shape=shape, minval=0, maxval=1, dtype=data_type)
                dot_operation = tf.matmul(r2, r1)

            with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as session:
                start_time = time.time()
                result = session.run(dot_operation)
                time_taken = time.time() - start_time
                print(result)
                device_times[device_name].append(time_taken)

            print(device_times)

            if time_taken > maximum_time:
                return device_times, matrix_sizes

device_times, matrix_sizes = get_times(1.5)
gpu_times = device_times["/gpu:0"]
cpu_times = device_times["/cpu:0"]

plt.plot(matrix_sizes[:len(gpu_times)], gpu_times, 'o-')
plt.plot(matrix_sizes[:len(cpu_times)], cpu_times, 'o-')
plt.ylabel('Time')
plt.xlabel('Matrix size')
plt.show()
plt.plot(matrix_sizes[:len(cpu_times)], [a/b for a,b in zip(cpu_times,gpu_times)], 'o-')
plt.ylabel('CPU Time / GPU Time')
plt.xlabel('Matrix size')
plt.show()

Result:
Similar to Erik's original finding, we found a huge difference between CPU and GPU. In this test, the GPU is 40-80 times faster than the CPU.

gpu_vs_cpu time

cpu time / gpu time

Author: Xu Cui Categories: deep learning Tags:

Updated loadHitachiText.m

March 16th, 2017

Some labs have been using our script readHitachiData.m to load NIRS data from Hitachi ETG machines. We recently found that some output MES data contain abnormal timestamps. For example, a timestamp should look like

16:49:25.406

But for some rows (although rarely), the time looks like this (note the trailing character):

16:49:25.406E

This trailing character causes our script to choke. We have fixed the issue, and you need to replace loadHitachiText.m. The new version can be found here.
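The fix itself lives in the MatLab script, but the idea is simply to strip the stray trailing character from the timestamp field before parsing. A rough illustration of that cleanup in Python (the regex and function name are mine, not from loadHitachiText.m):

```python
import re

def clean_timestamp(ts):
    # Keep only the HH:MM:SS.mmm part, dropping stray trailing characters
    # such as the "E" in "16:49:25.406E".
    m = re.match(r"\d{2}:\d{2}:\d{2}\.\d+", ts)
    return m.group(0) if m else ts

print(clean_timestamp("16:49:25.406E"))  # 16:49:25.406
print(clean_timestamp("16:49:25.406"))   # unchanged
```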

Author: Xu Cui Categories: brain, nirs Tags:

Learning deep learning (project 2, image classification)

March 7th, 2017

In this class project, I built a network to classify images in the CIFAR-10 dataset. This dataset is freely available.

The dataset contains 60K color images (32×32 pixels) in 10 classes, with 6K images per class.

Here are the classes in the dataset, as well as 10 random images from each:

airplane
automobile
bird
cat
deer
dog
frog
horse
ship
truck

You can imagine it's not possible to write down all the rules to classify them, so we have to write a program that can learn.

The neural network I created contains 2 hidden layers. The first is a convolutional layer with max pooling, followed by dropout of 70% of the connections. The second is a fully connected layer with 384 neurons.

def conv_net(x, keep_prob):
    """
    Create a convolutional neural network model
    : x: Placeholder tensor that holds image data.
    : keep_prob: Placeholder tensor that hold dropout keep probability.
    : return: Tensor that represents logits
    """
    # TODO: Apply 1, 2, or 3 Convolution and Max Pool layers
    #    Play around with different number of outputs, kernel size and stride
    # Function Definition from Above:
    #    conv2d_maxpool(x_tensor, conv_num_outputs, conv_ksize, conv_strides, pool_ksize, pool_strides)
    model = conv2d_maxpool(x, conv_num_outputs=18, conv_ksize=(4,4), conv_strides=(1,1), pool_ksize=(8,8), pool_strides=(1,1))
    model = tf.nn.dropout(model, keep_prob)

    # TODO: Apply a Flatten Layer
    # Function Definition from Above:
    #   flatten(x_tensor)
    model = flatten(model)

    # TODO: Apply 1, 2, or 3 Fully Connected Layers
    #    Play around with different number of outputs
    # Function Definition from Above:
    #   fully_conn(x_tensor, num_outputs)
    model = fully_conn(model,384)

    model = tf.nn.dropout(model, keep_prob)

    # TODO: Apply an Output Layer
    #    Set this to the number of classes
    # Function Definition from Above:
    #   output(x_tensor, num_outputs)
    model = output(model,10)

    # TODO: return output
    return model
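Dropping 70% of the connections corresponds to keep_prob = 0.3 in tf.nn.dropout (an assumed value for illustration; the actual value is fed through the keep_prob placeholder at training time). tf.nn.dropout uses "inverted dropout": surviving activations are scaled by 1/keep_prob so the expected value is unchanged. The mechanics in plain NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)
keep_prob = 0.3                 # drop 70% of connections (assumed value)
x = np.ones((1000, 64))

# Inverted dropout: zero out units with probability 1 - keep_prob,
# and scale the survivors by 1/keep_prob to preserve the expectation.
mask = rng.random(x.shape) < keep_prob
dropped = np.where(mask, x / keep_prob, 0.0)

print(mask.mean())      # fraction kept, close to 0.3
print(dropped.mean())   # close to 1.0, the mean of x
```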

Then I trained this network on an Amazon AWS g2.2xlarge instance. This instance has a GPU, which is much faster than a CPU for deep learning. I did a simple experiment and found the GPU to be at least 3 times faster than the CPU:

if all layers on the GPU: 14 seconds to run 4 epochs
if the conv layer on the CPU and the rest on the GPU: 36 seconds to run 4 epochs

This is admittedly a very crude comparison, but the GPU is definitely much faster than the CPU (at least the one in an AWS g2.2xlarge instance; cost: $0.65/hour).

Eventually I got ~70% accuracy on the test data, much better than random guessing (10%). Training the model took ~30 minutes.

You can find my entire code at:
http://www.alivelearn.net/deeplearning/dlnd_image_classification_submission2.html

Author: Xu Cui Categories: brain, deep learning Tags:

Learning deep learning on Udacity

February 9th, 2017

I am taking Udacity's deep learning class at https://www.udacity.com/course/deep-learning-nanodegree-foundation--nd101

I have done the first project: creating a neural network with 1 hidden layer (so not deep enough :)) to predict bike demand for a bike rental company. The data are real-life data, so this project actually has real applications. In a nutshell, we can predict how many bikes will be rented on a given day based on factors such as the weather, whether the day is a holiday, etc.

The same model can also be used in other applications, such as predicting the number of customers of a clothing shop or of a website.
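A one-hidden-layer network of this kind can be written in a few lines of NumPy. This is a toy sketch on made-up data (the real project used the bike-sharing dataset and more careful training), but the forward-pass and backpropagation structure are the same:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up regression data: 3 input features, 1 target (hypothetical)
X = rng.random((200, 3))
y = (X @ np.array([2.0, -1.0, 0.5]))[:, None]

# One hidden layer (sigmoid) and a linear output unit
W1 = rng.standard_normal((3, 8)) * 0.1
W2 = rng.standard_normal((8, 1)) * 0.1
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(3000):
    h = sigmoid(X @ W1)                 # forward pass through the hidden layer
    err = h @ W2 - y                    # prediction error
    # Backpropagate the error through both layers (gradient descent on MSE)
    W2 -= lr * h.T @ err / len(X)
    W1 -= lr * X.T @ ((err @ W2.T) * h * (1 - h)) / len(X)

print(np.mean(err ** 2))  # MSE drops well below the initial error
```

Swapping the toy inputs for day-level features (weather, holiday flags, etc.) gives exactly the shape of the bike-demand problem described above.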

My homework for this project can be found here:
http://www.alivelearn.net/deeplearning/dlnd-your-first-neural-network.html

Author: Xu Cui Categories: deep learning Tags:

Chin rest (head holder) device for NIRS

January 30th, 2017

When we set up our NIRS lab back in 2008, we needed a device to prevent participants' head movement during the experiment and during the digitizer measurement. Even though NIRS is tolerant of head motion, we still want to minimize it. During the digitizer measurement phase, the probe pokes the participant's head, resulting in inaccurate probe positions. We definitely needed something to minimize that.

In addition, we feared that metal might interfere with the magnetic positioning system (digitizer), so we wanted the device to be all-plastic.

We contacted Ben Krasnow, who has been very helpful in creating MRI-compatible devices (e.g. a keyboard) for the Lucas Center @ Stanford in the past. He suggested we use University of Houston's "headspot".

Headspot

Ben then replaced the metal part with plastics.

We have been using it for almost 10 years! It works great, as expected, and the height is adjustable. I recently checked the price: it is now $500, slightly higher than in 2008 ($415), but not much different. Ben charged $325 to replace the metal parts. The total (with tax) was $774.

headspot webpage

Author: Xu Cui Categories: brain, nirs Tags:

We contributed to MatLab (wavelet toolbox)

January 25th, 2017

We use MatLab a lot! It's the main program for brain imaging data analysis in our lab. However, I never thought we could actually contribute to MatLab's development.

In MatLab 2016, there is a toolbox called the Wavelet Toolbox. If you read its documentation on wavelet coherence (link below), you will find that they used our NIRS data as an example:

https://www.mathworks.com/help/wavelet/examples/compare-time-frequency-content-in-signals-with-wavelet-coherence.html

Back on 2015/4/9, Wayne King from MathWorks contacted us, saying that they were developing the wavelet toolbox and asking if we could share some data as an example. We did. I'm glad that it's part of the package now.

The following section is from the page above:


Find Coherent Oscillations in Brain Activity

In the previous examples, it was natural to view one time series as influencing the other. In these cases, examining the lead-lag relationship between the data is informative. In other cases, it is more natural to examine the coherence alone.

For an example, consider near-infrared spectroscopy (NIRS) data obtained in two human subjects. NIRS measures brain activity by exploiting the different absorption characteristics of oxygenated and deoxygenated hemoglobin. The data is taken from Cui, Bryant, & Reiss (2012) and was kindly provided by the authors for this example. The recording site was the superior frontal cortex for both subjects. The data is sampled at 10 Hz. In the experiment, the subjects alternately cooperated and competed on a task. The period of the task was approximately 7.5 seconds.

load NIRSData;
figure
plot(tm,NIRSData(:,1))
hold on
plot(tm,NIRSData(:,2),'r')
legend('Subject 1','Subject 2','Location','NorthWest')
xlabel('Seconds')
title('NIRS Data')
grid on;
hold off;

Obtain the wavelet coherence as a function of time and frequency. You can use wcoherence to output the wavelet coherence, cross-spectrum, scale-to-frequency, or scale-to-period conversions, as well as the cone of influence. In this example, the helper function helperPlotCoherence packages some useful commands for plotting the outputs of wcoherence.

[wcoh,~,F,coi] = wcoherence(NIRSData(:,1),NIRSData(:,2),10,'numscales',16);
helperPlotCoherence(wcoh,tm,F,coi,'Seconds','Hz');

In the plot, you see a region of strong coherence throughout the data collection period around 1 Hz. This results from the cardiac rhythms of the two subjects. Additionally, you see regions of strong coherence around 0.13 Hz. This represents coherent oscillations in the subjects’ brains induced by the task. If it is more natural to view the wavelet coherence in terms of periods rather than frequencies, you can use the ‘dt’ option and input the sampling interval. With the ‘dt’ option, wcoherence provides scale-to-period conversions.

[wcoh,~,P,coi] = wcoherence(NIRSData(:,1),NIRSData(:,2),seconds(0.1),...
    'numscales',16);
helperPlotCoherence(wcoh,tm,seconds(P),seconds(coi),...
    'Time (secs)','Periods (Seconds)');

Again, note the coherent oscillations corresponding to the subjects’ cardiac activity occurring throughout the recordings with a period of approximately one second. The task-related activity is also apparent with a period of approximately 8 seconds. Consult Cui, Bryant, & Reiss (2012) for a more detailed wavelet analysis of this data.

Conclusions

In this example you learned how to use wavelet coherence to look for time-localized coherent oscillatory behavior in two time series. For nonstationary signals, it is often more informative if you have a measure of coherence that provides simultaneous time and frequency (period) information. The relative phase information obtained from the wavelet cross-spectrum can be informative when one time series directly affects oscillations in the other.

References

Cui, X., Bryant, D.M., and Reiss. A.L. “NIRS-Based hyperscanning reveals increased interpersonal coherence in superior frontal cortex during cooperation”, Neuroimage, 59(3), pp. 2430-2437, 2012.

Grinsted, A., Moore, J.C., and Jevrejeva, S. “Application of the cross wavelet transform and wavelet coherence to geophysical time series”, Nonlin. Processes Geophys., 11, pp. 561-566, 2004.

Maraun, D., Kurths, J. and Holschneider, M. “Nonstationary Gaussian processes in wavelet domain: Synthesis, estimation and significance testing”, Phys. Rev. E 75, pp. 016707(1)-016707(14), 2007.

Torrence, C. and Webster, P. "Interdecadal changes in the ENSO-Monsoon System," J. Clim., 12, pp. 2679-2690, 1999.
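wcoherence is a MatLab Wavelet Toolbox function; for readers who want to experiment without it, a rough (non-wavelet) analogue is magnitude-squared coherence via Welch's method in SciPy. The sketch below uses synthetic signals sharing a 0.13 Hz oscillation, mimicking the task-induced coherence described above; the actual NIRS data is not reproduced here:

```python
import numpy as np
from scipy.signal import coherence

fs = 10.0                          # 10 Hz sampling, as in the NIRS data
t = np.arange(0, 300, 1 / fs)      # 5 minutes of synthetic data
rng = np.random.default_rng(0)

# Two signals sharing a ~0.13 Hz oscillation (task period ~7.5 s) plus noise
task = np.sin(2 * np.pi * 0.13 * t)
s1 = task + 0.5 * rng.standard_normal(t.size)
s2 = task + 0.5 * rng.standard_normal(t.size)

# Magnitude-squared coherence, averaged over overlapping segments
f, Cxy = coherence(s1, s2, fs=fs, nperseg=256)
print(f[np.argmax(Cxy)])  # peak coherence sits near the 0.13 Hz task frequency
```

Unlike wcoherence, this gives no time localization: it is a single coherence spectrum for the whole recording, which is exactly the limitation the wavelet approach addresses.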

Author: Xu Cui Categories: brain, matlab, nirs, programming Tags:

Deep learning

January 20th, 2017

In the past months, I have been shocked by the progress of artificial intelligence (mostly driven by deep learning). In March 2016, AlphaGo beat Lee Sedol (李世石) in Weiqi (Go). I had mixed feelings: excitement, sadness, and some fear. Around the new year of 2017, AlphaGo won 60 games in a row against numerous top professional Weiqi players in China, Korea and Japan, including #1 Ke Jie. There is no doubt AlphaGo is at least a level better than the top human players. It's interesting that the way people refer to AlphaGo has changed from "dog" to "Teacher Ah", reflecting the change in our attitude toward artificial intelligence.

Games are not the only area where AI has shocked me. Below are some areas where AI / deep learning has done extremely well:

  1. Convert text to handwriting: try it yourself at http://www.cs.toronto.edu/~graves/handwriting.html Maybe in the future you can use AI to write your greeting cards.
  2. Apply artistic style to drawings. Check out https://www.youtube.com/watch?v=Uxax5EKg0zA and https://www.youtube.com/watch?v=jMZqxfTls-0
  3. Fluid simulation
  4. Generate a text description of an image
  5. Real time facial expression transfer https://www.youtube.com/watch?v=mkI6qfpEJmI
  6. Language translation
  7. Handwriting recognition (try it here: http://cs.stanford.edu/people/karpathy/convnetjs/demo/mnist.html) This is not new progress but still worth mentioning
  8. Medical diagnosis
  9. And many more. I will update this list constantly
In the field of biology and medicine, deep learning is also progressing rapidly. Below is the number of publications using the keyword "deep learning" in PubMed.

deep learning publications in PubMed

"Deep Learning" is also a keyword in my Stork. I get new papers almost every day.

Some resources to learn more about deep learning and keep updated:

  1. Track "Deep Learning" publications using Stork
  2. Subscribe to the YouTube channel Two Minute Papers (https://www.youtube.com/user/keeroyz). It contains many excellent short videos on applications of deep learning
  3. Play with it here: http://playground.tensorflow.org/
  4. A few examples here: http://cs.stanford.edu/people/karpathy/convnetjs/
  5. I am going to take Udacity's deep learning class at https://www.udacity.com/course/deep-learning-nanodegree-foundation--nd101
Author: Xu Cui Categories: deep learning, programming Tags:

Stork now supports automatic translation

January 17th, 2017

When I was a graduate student in the US, one big headache was reading the scientific literature. There were many reasons, but one was that papers contained many words I didn't know, so I had to consult a dictionary from time to time. Things gradually got better, but whenever I entered a new field, new field-specific vocabulary appeared and I had to learn those words too. Back then I thought it would be great to have a tool that could automatically translate the literature into Chinese.

With the progress of artificial intelligence in translation, accurate translation of long sentences has become possible. Of course, it is still far from the classic standard of "faithfulness, expressiveness, elegance" (信达雅), but judging from some examples it already achieves faithfulness and has made initial progress toward expressiveness; perhaps elegance will come later. For researchers, faithfulness alone is already very useful, since we are not completely ignorant of English.

Seeing the technology mature, we added an advanced feature to Stork (文献鸟): translation! When Stork emails me, the papers in the email are already translated:

This way I can quickly scan the literature and save a lot of time.

When I open a paper, its abstract is translated as well:

Overall the translation quality is quite good, much better than word-by-word translation, and it works well as a reading aid.

This is a paid feature. If you are interested, please see the instructions on the following page:

http://www.storkapp.me/translation.html

Author: Xu Cui Categories: programming, stork, web Tags:

Communications between two MatLabs (2): over socket

October 17th, 2016

Aaron Piccirilli

After the previous blog post, Communications between two MatLabs (1): over file, Aaron Piccirilli in our lab suggested a more efficient way to communicate between two MatLabs: over a socket. Below is the source code provided by Aaron:

% Sender (run in the first MatLab instance)
udpSocket = udp('127.0.0.1', 'LocalPort', 2121, 'RemotePort', 2122);
fopen(udpSocket);
udpCleaner = onCleanup(@() fclose(udpSocket));
for ii = 1:100
    fprintf(udpSocket, '%d%f', [ii ii/100]);
    disp(['Sending ' num2str(ii)]);
    pause(0.1);
end

% Receiver (run in the second MatLab instance)
udpSocket = udp('127.0.0.1', 'LocalPort', 2122, 'RemotePort', 2121);
fopen(udpSocket);
udpCleaner = onCleanup(@() fclose(udpSocket));
while(1)
    if udpSocket.BytesAvailable
        ii = fscanf(udpSocket, '%d%f');
        disp(['Received ' num2str(ii(1)) '-' num2str(ii(2))])
    end
end

More words from Aaron:

I also wrote a couple of quick tests to compare timing by having each method pass 10,000 random integers as quickly as they could. Using UDP is over four times faster on my work machine, and would be sufficient to keep up with sampling rates up to about 900 Hz, whereas the file-based transfer became too slow at about 200 Hz.

Obviously these rates and timings are going to be system and data-dependent, but the UDP implementation is about the same amount of code. It has some added benefits, too. First is what I mentioned before - that this allows you to communicate between different languages. Second, though, is what might be more important: buffer management. If your data source is sending data faster than you can process it, even for just a moment, the UDP method handles that gracefully with a buffer. To get the same functionality with the file method you have to write your own buffer maintenance - not too tricky, but adds another layer of complexity and probably won’t be as efficient.

I did a similar timing test passing 40 floats each time (say for 20 channels of NIRS data) instead of a single integer and the timing did not really change on my machine.

I also tested the above scripts, and they work beautifully! I definitely recommend this method over the 'file' method. One thing to note: when you press Ctrl-C to quit the program, remember to close the socket (fclose(udpSocket)) AND clear the variables (udpSocket, udpCleaner); otherwise you will run into the "Unsuccessful open: Unrecognized Windows Sockets error: 0: Cannot bind" error.
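The same send/receive pattern works across languages, which is part of the appeal Aaron mentions. A minimal Python sketch of the idea, packing an int plus a double per datagram, loosely analogous to the '%d%f' format above (the ports and packing format are my own choices, not part of Aaron's code):

```python
import socket
import struct

# Receiver: bind a UDP socket to a local port (0 = let the OS pick one)
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))
recv_sock.settimeout(2.0)
port = recv_sock.getsockname()[1]

# Sender: pack an int and a double per datagram, like '%d%f' above
send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
received = []
for ii in range(1, 4):
    send_sock.sendto(struct.pack("!id", ii, ii / 100), ("127.0.0.1", port))
    data, _ = recv_sock.recvfrom(1024)
    i, x = struct.unpack("!id", data)
    received.append((i, x))
    print(f"Received {i}-{x}")

send_sock.close()
recv_sock.close()
```

In a real setup the sender and receiver would of course be separate processes (e.g. MatLab on one side, Python on the other), with the OS buffering datagrams between them exactly as Aaron describes.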

Note from Aaron:

One note: the onCleanup function/object is designed as a callback of sorts: no matter how the function exits (normally, error, crash, Ctrl-C), when the onCleanup object is automatically then destroyed, its anonymous function should run. Thus, the UDP connection should be closed no matter how you exit the function. This won’t work for a script, though, or if you were just running the code on its own in a command window, as the onCleanup object wouldn’t be automatically destroyed. I would just exclude that line completely if you weren’t running it as a function.

Author: Xu Cui Categories: matlab, programming Tags: