Spatial decoding on neuropixels data

Reading the neuropixels data

Neuropixels data can currently be read only on a Windows computer, while Falcon runs only on Linux. To work around this, we use the Open Ephys software to stream the data from the Windows computer to the Linux computer (or a Docker image on the same station) running Falcon.

To do that, we use in Open Ephys a signal chain containing at least:

Note

For test purposes, the reader node can be replaced by a File Reader node to stream a previously recorded dataset without the need to set up the hardware.

On the Falcon side, the data are received by using the OpenEphysZMQ node as the input node.

Note

In this use case, Neuropixels data is used, but any data read by Open Ephys could be streamed. This step can also be removed entirely when using an acquisition system compatible with Falcon, by replacing the OpenEphysZMQ input node with any existing data reader (Neuralynx, for example).

Reading the position data

The position data are streamed from your own position tracking device. FKLab has its own streaming system providing (at least) the x and y positions and the linearized position.

Speed thresholding

The processor computes and smooths the speed based on the position received as input. It can also compute it from the linearized position. The output is an event, “crossed” or “not crossed”. It is used to enable the decoding processing only when the speed is below the threshold, via the to_decode shared state.
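The step above can be sketched as follows. This is a minimal illustration, not the Falcon processor itself: the function names, the moving-average smoothing, and the mapping of “crossed” to speeds below the threshold are assumptions.

```python
import numpy as np

def smoothed_speed(x, y, dt, kernel_size=5):
    """Finite-difference speed from x/y positions, smoothed with a
    simple moving average (the actual smoothing kernel is an assumption)."""
    vx = np.gradient(np.asarray(x, dtype=float)) / dt
    vy = np.gradient(np.asarray(y, dtype=float)) / dt
    speed = np.hypot(vx, vy)
    kernel = np.ones(kernel_size) / kernel_size
    return np.convolve(speed, kernel, mode="same")

def threshold_events(speed, threshold):
    """Emit one event per sample; here 'crossed' is taken to mean
    'speed below threshold' (decoding enabled), an assumption."""
    return ["crossed" if s < threshold else "not crossed" for s in speed]
```

In the real pipeline the event stream would drive the to_decode shared state rather than being returned as a list.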

Spatial filtering

In Falcon, a global CAR filter is implemented, but it slows down the pipeline considerably. An alternative could be to use the demuxed CAR filter implemented in Open Ephys and optimized for Neuropixels.
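For reference, the two referencing schemes differ only in which channels are averaged together. The sketch below is illustrative: the global version subtracts the across-channel mean, and the demuxed version references each group of channels separately (the grouping used here is arbitrary, not the actual Neuropixels ADC multiplexing).

```python
import numpy as np

def global_car(data):
    """Global common average reference: subtract the across-channel
    mean at every sample. data has shape (n_channels, n_samples)."""
    return data - data.mean(axis=0, keepdims=True)

def demuxed_car(data, groups):
    """Demuxed CAR: channels in the same group (e.g. sharing an ADC)
    are referenced together. `groups` is a list of channel-index lists."""
    out = data.astype(float).copy()
    for idx in groups:
        out[idx] -= data[idx].mean(axis=0, keepdims=True)
    return out
```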

Split the channels by likelihoods object

The distributor processor is used to split the data for each likelihood object onto separate streams / slots. The associated stream name is the filename of the likelihood object to use from the decoder model. The split follows the channel map yaml file generated offline (see requirements).
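As an illustration, such a channel map might look like the fragment below. The field names and layout are hypothetical, not the actual schema; the key point is that each entry ties a stream name (the likelihood object's filename in the decoder model) to the channels routed to that stream.

```yaml
# Hypothetical channel map (illustrative schema):
# stream name -> channels routed to that likelihood object
likelihood_ch0:
  channels: [0]
likelihood_ch1:
  channels: [1]
# ... one entry per likelihood object (384 in the example below)
```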

From here until the likelihoods are merged at the end, the processing chain described below runs independently on each stream via the use of slots.

Example: here, the data are split into 384 likelihood streams on 384 independent slots (one per channel). The processing chain is then parallelized across 8 processors, each processing 48 channels received separately on 48 slots.

connections:
     - raw = splitter

     - splitter.data.(0-47) = filter0.data.(0-47)
     - splitter.data.(48-95) = filter1.data.(0-47)
     ...
     - splitter.data.(336-383) = filter7.data.(0-47)

     - filter(0-7).data.(0-47) = spike_detector(0-7).data.(0-47)
     - spike_detector(0-7).data.(0-47) = likelihoods(0-7).data.(0-47)
     - likelihoods(0-7).loglikelihood.(0-47) = decoder.loglikelihood.(0-383)
     - decoder = saver_out

Note

The multithreading can be configured by connecting the slots to different processors in the connection part of the graph. Internally, some of these processors also use OpenMP parallelization, creating a thread for each slot.

Preprocessing step

Usually the data is filtered with a bandpass filter. In the graph example below, a biquad filter with a 600-3000 Hz passband is used. The filter can and should be customized for each experiment.
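As a sketch, one way to realize a 600-3000 Hz bandpass as a single biquad is the RBJ-cookbook band-pass design below. This is only an illustration of the filtering step: the actual coefficients and filter topology used by the Falcon processor may differ.

```python
import math

def bandpass_biquad_coeffs(fs, f_low, f_high):
    """RBJ-cookbook band-pass biquad (constant 0 dB peak gain).
    The band edges are expressed as a centre frequency and a Q."""
    f0 = math.sqrt(f_low * f_high)      # geometric centre frequency
    q = f0 / (f_high - f_low)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [alpha / a0, 0.0, -alpha / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a

def biquad_filter(x, b, a):
    """Direct-form I biquad applied sample by sample."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for s in x:
        y = b[0] * s + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, s
        y2, y1 = y1, y
        out.append(y)
    return out
```

A tone at the centre frequency passes with unit gain, while DC (and slow drift) is rejected, which is the behaviour expected from the 600-3000 Hz band.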

This preprocessing step needs to be consistent with the preprocessing used to train the model.

Data bufferization

Until this point, the number of samples per packet was variable, depending on what Open Ephys sent. From here on, the data is rebuffered to match the decoding buffer size.
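The rebuffering step amounts to the following sketch (class and method names are illustrative, not the Falcon processor's API): incoming packets of arbitrary size are accumulated, and fixed-size buffers are emitted as soon as enough samples are available.

```python
class Rebuffer:
    """Accumulate variable-sized packets and emit fixed-size buffers
    matching the decoding buffer size."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self._pending = []

    def push(self, packet):
        """Add one packet (a list of samples); return every complete
        fixed-size buffer that became available. Leftover samples are
        kept for the next packet."""
        self._pending.extend(packet)
        out = []
        while len(self._pending) >= self.buffer_size:
            out.append(self._pending[:self.buffer_size])
            self._pending = self._pending[self.buffer_size:]
        return out
```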

Compute the spike features

In this step, the spikeFeatures processor detects spikes and computes the features used later to compute the likelihoods.

There is a set of features available in this processor:

  • timestamp

  • amplitude

  • slope

  • channel index

  • channel depth: a yaml file giving the correspondence between depth and channel number must be provided as input.

These features need to be identical to the features selected in the decoding model given as input. This also means that when creating the offline model in Python, the spike feature names must match these.
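A minimal sketch of this step, using the feature names listed above: detect negative threshold crossings on one channel and attach the features to each detection. The detection rule, the slope definition, and taking the amplitude at the crossing sample are simplifying assumptions, not the spikeFeatures implementation.

```python
import numpy as np

def detect_spike_features(trace, channel, threshold, fs, depth_map):
    """Detect negative threshold crossings on one channel.
    `depth_map` stands in for the depth/channel yaml file;
    amplitude and slope are taken at the crossing sample for brevity."""
    below = trace < -abs(threshold)
    # first sample of each run below threshold = one detection
    onsets = np.flatnonzero(below & ~np.roll(below, 1))
    features = []
    for i in onsets:
        features.append({
            "timestamp": i / fs,
            "amplitude": float(abs(trace[i])),
            "slope": float(trace[i] - trace[i - 1]) if i > 0 else 0.0,
            "channel index": channel,
            "channel depth": depth_map[channel],
        })
    return features
```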

Decode to obtain a posterior likelihood

This step is based on the pycompressed-decoder lib and can be split into two processors for better parallelization.

  • MultiLikelihoodSource: computes the log-likelihood based on the spike features

  • LikelihoodsMerger: merges all log-likelihoods and computes the posterior likelihood.

The MultiLikelihoodSource can also be activated/deactivated based on the to_decode shared state. Usually, the criterion used to decide whether data can be used for decoding is a speed threshold.
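As a sketch of what the merge step computes: treating the streams as independent, their log-likelihoods add, and the sum is normalized into a posterior over position bins. This mirrors the idea behind LikelihoodsMerger but is only an illustration, not the pycompressed-decoder code; the optional prior is an assumption.

```python
import numpy as np

def merge_loglikelihoods(loglikelihoods, prior=None):
    """Merge per-stream log-likelihoods (one array of position bins per
    stream) into a normalized posterior. Working in log space and
    subtracting the maximum avoids overflow in exp."""
    total = np.sum(loglikelihoods, axis=0)
    if prior is not None:
        total = total + np.log(prior)
    total -= total.max()
    posterior = np.exp(total)
    return posterior / posterior.sum()
```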

A third processor is used to load the offline model. It can also be used to train the model in real time if the training option is activated.

  • OnlineEncoder: the processor loads the model given as input and shares it on two states:
    • a map of stream names to the corresponding likelihood objects (used by the MultiLikelihoodSource)

    • the decoder object (used by the LikelihoodsMerger)