I was testing both the nltools and Brainiak implementations of ISC and was getting wildly different p-values between the two. I was running it on Discovery with 1 node and 16ppn, and set the n_jobs parameter in the nltools implementation to the default value of -1 (which uses all available processors for the computation). I played around with the number of processors to compare speed, and realized that specifying fewer processors (around 1-4) gave me results more similar to the Brainiak implementation, while increasing the number of processors gave me smaller and smaller p-values (and, strangely, slower computation). This happens both when I change the n_jobs parameter and when I request different numbers of processors in the job script I submit to Discovery. I'm wondering if this is a bug and the implementation wasn't optimized to be parallelized across that many processors. I'm hoping someone can shed some light on this or look into it! Thanks in advance!
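For reference, here's a minimal sketch of the kind of comparison I mean, on randomly generated data rather than my actual dataset. Apart from n_jobs, the argument names are just my best reading of the nltools 0.4.x docs, so treat them as approximate:

```python
import numpy as np
import pandas as pd
from nltools.stats import isc

rng = np.random.default_rng(0)
# Fake time series: 200 timepoints x 20 subjects (subjects as columns)
data = pd.DataFrame(rng.standard_normal((200, 20)))

# Run the same bootstrap ISC with different levels of parallelization and
# compare the reported statistics/p-values across n_jobs settings
for n_jobs in [1, 4, 16]:
    results = isc(data, n_samples=5000, method='bootstrap', n_jobs=n_jobs)
    print(n_jobs, results)
```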
Hi @josie.equita, thanks for posting your experience and debugging. Can you say more about the versions of nltools and joblib you are using? We just encountered a different problem with another permutation function that ended up being an issue with using an old version of joblib.
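For example, running something like this from inside the environment you're using should print them (assuming both packages expose the standard __version__ attribute, which I believe they do):

```python
# Print the installed versions from inside the active environment
import joblib
import nltools

print("joblib:", joblib.__version__)
print("nltools:", nltools.__version__)
```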
Hi @ljchang, thank you for responding! I have joblib version 0.17.0 and nltools version 0.4.2 installed in the environment I'm using. From what I've just searched, those both seem to be the latest versions, but please correct me if I'm mistaken. Let me know if there's anything else I could check.
Do you mind posting your experience as a GitHub issue? @ejolly or I will take a look and see if we can figure out what's going on. https://github.com/cosanlab/nltools/issues
@josiequita Can you provide a little bit more information about how you're calling ISC from within `nltools`? Also, are you able to reproduce this on a non-cluster computer? Here's a link to a notebook running on my local machine where I'm not able to reproduce this error with randomly generated data: notebook link
Also, in general I tend to avoid using `n_jobs=-1` on the cluster because of how resource sharing works. To avoid jobs being killed prematurely, and to be a good citizen to other users, it's preferable to be explicit (e.g. `n_jobs=16`). For example, if you requested 16ppn and your job lands on a node with 64 cores, the scheduler on Discovery will mark the remaining 48 cores (64 - 16) as available for others to use. However, `n_jobs=-1` will try to use everything on that node, potentially causing issues for yourself or other users. This isn't an issue if your job lands on a machine with exactly the number of ppn you requested and no one else is using that node.
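If it's useful, here's one rough way to keep n_jobs tied to the CPUs your job was actually given instead of hard-coding a number. It assumes a Linux node where the scheduler pins your job to its allocated cores; the isc() arguments are the same loose sketch of the nltools call as above, not an exact signature:

```python
import os
import numpy as np
import pandas as pd
from nltools.stats import isc

# Number of CPUs this process is actually allowed to use (Linux only). If your
# scheduler doesn't set CPU affinity for jobs, just hard-code the ppn you requested.
n_jobs = len(os.sched_getaffinity(0))

rng = np.random.default_rng(0)
data = pd.DataFrame(rng.standard_normal((200, 20)))  # placeholder data

results = isc(data, n_samples=5000, method='bootstrap', n_jobs=n_jobs)
```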