bed-reader
Read and write the PLINK BED format, simply and efficiently.
Highlights: fast and multi-threaded.
Full version: Can read local and cloud files
cargo add bed-reader
Minimal version: Can read local files only
cargo add bed-reader --no-default-features
Read all genotype data from a .bed file.
use ndarray as nd;
use bed_reader::{Bed, ReadOptions, assert_eq_nan, sample_bed_file};

// Fetch a small sample file, open it, and read every value as f64.
let file_name = sample_bed_file("small.bed")?;
let mut bed = Bed::new(file_name)?;
let val = ReadOptions::builder().f64().read(&mut bed)?;

// Missing genotypes come back as NaN, so compare with a NaN-aware assertion.
assert_eq_nan(
    &val,
    &nd::array![
        [1.0, 0.0, f64::NAN, 0.0],
        [2.0, 0.0, f64::NAN, 2.0],
        [0.0, 1.0, 2.0, 0.0]
    ],
);
Read every second individual (sample) and SNPs (variants) from 20 to 30 (exclusive).
use ndarray::s;
use bed_reader::{Bed, ReadOptions, sample_bed_file};

let file_name = sample_bed_file("some_missing.bed")?;
let mut bed = Bed::new(file_name)?;
let val = ReadOptions::builder()
    .iid_index(s![..;2]) // every second individual
    .sid_index(20..30)   // SNPs 20 through 29
    .f64()
    .read(&mut bed)?;

assert!(val.dim() == (50, 10));
List the first 5 individual (sample) ids, the first 5 SNP (variant) ids,
and every unique chromosome. Then, read every genomic value in chromosome 5.
// Continues from the previous example (same imports and file_name).
use std::collections::HashSet;

let mut bed = Bed::new(file_name)?;
println!("{:?}", bed.iid()?.slice(s![..5]));
println!("{:?}", bed.sid()?.slice(s![..5]));
println!("{:?}", bed.chromosome()?.iter().collect::<HashSet<_>>());
let val = ReadOptions::builder()
    // A boolean mask: true for every SNP located on chromosome "5".
    .sid_index(bed.chromosome()?.map(|elem| elem == "5"))
    .f64()
    .read(&mut bed)?;

assert!(val.dim() == (100, 6));
From the cloud: open a file and read data for one SNP (variant)
at index position 2. (See “Cloud URLs and CloudFile Examples”
for details on specifying a file in the cloud.)
// The .await calls must run inside an async function.
use ndarray as nd;
use bed_reader::{assert_eq_nan, BedCloud, ReadOptions};

let url = "https://raw.githubusercontent.com/fastlmm/bed-sample-files/main/small.bed";
let mut bed_cloud = BedCloud::new(url).await?;
let val = ReadOptions::builder().sid_index(2).f64().read_cloud(&mut bed_cloud).await?;
assert_eq_nan(&val, &nd::array![[f64::NAN], [f64::NAN], [2.0]]);
After using Bed::new or Bed::builder to open a PLINK .bed file for reading, use
these methods to see metadata.
Method | Description |
---|---|
iid_count | Number of individuals (samples) |
sid_count | Number of SNPs (variants) |
dim | Number of individuals and SNPs |
fid | Family id of each individual (sample) |
iid | Individual id of each individual (sample) |
father | Father id of each individual (sample) |
mother | Mother id of each individual (sample) |
sex | Sex of each individual (sample) |
pheno | A phenotype for each individual (seldom used) |
chromosome | Chromosome of each SNP (variant) |
sid | SNP id of each SNP (variant) |
cm_position | Centimorgan position of each SNP (variant) |
bp_position | Base-pair position of each SNP (variant) |
allele_1 | First allele of each SNP (variant) |
allele_2 | Second allele of each SNP (variant) |
metadata | All the metadata, returned as a Metadata struct |
ReadOptions
When using ReadOptions::builder to read genotype data, use these options to
specify a desired numeric type,
which individuals (samples) to read, which SNPs (variants) to read, etc.
Option | Description |
---|---|
i8 | Read values as i8 |
f32 | Read values as f32 |
f64 | Read values as f64 |
iid_index | Index of individuals (samples) to read (defaults to all) |
sid_index | Index of SNPs (variants) to read (defaults to all) |
f | Order of the output array, Fortran-style (default) |
c | Order of the output array, C-style |
is_f | Is order of the output array Fortran-style? (defaults to true) |
missing_value | Value to use for missing values (defaults to -127 or NaN) |
count_a1 | Count the number of allele 1 (default) |
count_a2 | Count the number of allele 2 |
is_a1_counted | Is allele 1 counted? (defaults to true) |
num_threads | Number of threads to use (defaults to all processors) |
max_concurrent_requests | Maximum number of concurrent async requests (defaults to 10). Used by BedCloud. |
max_chunk_bytes | Maximum chunk size of async requests (defaults to 8_000_000 bytes). Used by BedCloud. |
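To illustrate how the count_a1 and count_a2 options relate, here is a minimal sketch using only the standard library. It assumes diploid, biallelic genotypes, so the two allele counts sum to 2, and it assumes the default missing values listed in the table (-127 for i8, NaN for floats). The helper names flip_count_i8 and flip_count_f64 are hypothetical and are not part of the bed-reader API.

```rust
/// Default missing value for i8 output, as listed in the options table.
const MISSING_I8: i8 = -127;

/// Convert an allele-1 count into an allele-2 count (or back).
/// Hypothetical helper: assumes counts are 0, 1, or 2, or the missing marker.
fn flip_count_i8(count: i8, missing: i8) -> i8 {
    if count == missing {
        missing // a missing genotype stays missing
    } else {
        2 - count // the two allele counts of a diploid genotype sum to 2
    }
}

/// Same idea for floating-point output, where missing is NaN.
/// NaN propagates through the subtraction, so no special case is needed.
fn flip_count_f64(count: f64) -> f64 {
    2.0 - count
}
```

For example, a value of 2 under count_a1 corresponds to 0 under count_a2, and a heterozygous value of 1 is unchanged.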
Select which individuals (samples) and SNPs (variants) to read by using these
iid_index and/or sid_index expressions.
Example | Type | Description |
---|---|---|
nothing | () | All |
2 | isize | Index position 2 |
-1 | isize | Last index position |
vec![0, 10, -2] | Vec<isize> | Index positions 0, 10, and 2nd from last |
[0, 10, -2] | [isize] and [isize; n] | Index positions 0, 10, and 2nd from last |
ndarray::array![0, 10, -2] | ndarray::Array1<isize> | Index positions 0, 10, and 2nd from last |
10..20 | Range<usize> | Index positions 10 (inclusive) to 20 (exclusive). Note: Rust ranges don't support negatives |
..=19 | RangeInclusive<usize> | Index positions 0 (inclusive) to 19 (inclusive). Note: Rust ranges don't support negatives |
any Rust ranges | Range*<usize> | Note: Rust ranges don't support negatives |
s![10..20;2] | ndarray::SliceInfo1 | Index positions 10 (inclusive) to 20 (exclusive) in steps of 2 |
s![-20..-10;-2] | ndarray::SliceInfo1 | 10th from last (exclusive) to 20th from last (inclusive), in steps of -2 |
vec![true, false, true] | Vec<bool> | Index positions 0 and 2 |
[true, false, true] | [bool] and [bool; n] | Index positions 0 and 2 |
ndarray::array![true, false, true] | ndarray::Array1<bool> | Index positions 0 and 2 |
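The semantics of negative indices and boolean masks in the table can be sketched with the standard library alone. This is an illustration of the behavior described above, not the crate's implementation. The function names resolve_index and mask_to_positions are hypothetical.

```rust
/// Resolve a possibly negative index against a dimension of length `count`.
/// Negative values count from the end, so -1 is the last position.
/// Returns `None` when the index falls outside the dimension.
fn resolve_index(index: isize, count: usize) -> Option<usize> {
    let count = count as isize;
    let resolved = if index < 0 { index + count } else { index };
    if (0..count).contains(&resolved) {
        Some(resolved as usize)
    } else {
        None
    }
}

/// Turn a boolean mask into the index positions whose entry is `true`.
fn mask_to_positions(mask: &[bool]) -> Vec<usize> {
    mask.iter()
        .enumerate()
        .filter_map(|(position, &keep)| keep.then_some(position))
        .collect()
}
```

With a dimension of length 10, resolve_index(-2, 10) yields Some(8), the 2nd position from last, and mask_to_positions(&[true, false, true]) yields positions 0 and 2.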
If ReadOptionsBuilder::num_threads
or WriteOptionsBuilder::num_threads
is not specified,
the number of threads to use is determined by these environment variables (in order of priority):
BED_READER_NUM_THREADS
NUM_THREADS
If neither of these environment variables is set, all processors are used.
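The priority order described above can be sketched as follows. This is a standard-library illustration under stated assumptions, not the crate's own code: the function resolve_num_threads is hypothetical, the environment lookup is passed in as a closure so the logic is easy to test, and values that fail to parse are simply skipped, which may differ from how bed-reader handles them.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Pick a thread count: explicit option first, then BED_READER_NUM_THREADS,
/// then NUM_THREADS, and finally all available processors.
/// Hypothetical helper; `lookup` stands in for reading an environment variable.
fn resolve_num_threads<F>(explicit: Option<usize>, lookup: F) -> usize
where
    F: Fn(&str) -> Option<String>,
{
    // 1. A value set through the builder always wins.
    if let Some(n) = explicit {
        return n;
    }
    // 2. Otherwise consult the environment variables in priority order.
    for name in ["BED_READER_NUM_THREADS", "NUM_THREADS"] {
        // Assumption: unparsable values are ignored rather than reported.
        if let Some(n) = lookup(name).and_then(|v| v.trim().parse::<usize>().ok()) {
            return n;
        }
    }
    // 3. Fall back to every processor the OS reports (at least one).
    thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1)
}
```

In a real program the lookup would typically be `|name| std::env::var(name).ok()`.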
Any requested sample file will be downloaded to the directory named by the BED_READER_DATA_DIR environment variable. If the environment variable is not set,
a cache folder, appropriate to the OS, will be used.