
Arc Virtual Cell Atlas: scRNA-seq

The Arc Virtual Cell Atlas hosts one of the biggest collections of scRNA-seq datasets.

Lamin mirrors the dataset for simplified access here: https://lamin.ai/laminlabs/arc-virtual-cell-atlas

If you use the data academically, please cite the original publications, Youngblut et al. (2025) and Zhang et al. (2025).

Connect to the public LaminDB instance that mirrors the Arc Virtual Cell Atlas:

# pip install 'lamindb[jupyter,bionty,wetlab,gcp]'
!lamin connect laminlabs/arc-virtual-cell-atlas
 connected lamindb: laminlabs/arc-virtual-cell-atlas
import lamindb as ln
import bionty as bt
import wetlab as wl
 connected lamindb: laminlabs/arc-virtual-cell-atlas

Metadata

21 organisms.

bt.Organism.df()
uid name ontology_id scientific_name synonyms description space_id source_id run_id created_at created_by_id _aux _branch_code
id
1 7PRZ0StI thale-cress NCBITaxon:3702 Arabidopsis thaliana thale cress|mouse-ear cress None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
2 NY4Fwo6n dairy cow NCBITaxon:9913 Bos taurus ox|oxen|bovine|domestic cow|cattle|cow|domesti... None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
3 5CDgMx83 caenorhabditis elegans NCBITaxon:6239 Caenorhabditis elegans None None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
4 3qUn9Zrd white-tufted-ear marmoset NCBITaxon:9483 Callithrix jacchus white ear-tufted marmoset|common marmoset None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
5 7bjJiyyC zebra danio NCBITaxon:7955 Danio rerio zebra fish|leopard danio|zebrafish None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
6 5JGkUUo0 fruit fly NCBITaxon:7227 Drosophila melanogaster None None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
7 s7OX44As equine NCBITaxon:9796 Equus caballus horse|domestic horse None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
8 4fzSxbbW chicken NCBITaxon:9031 Gallus gallus Gallus domesticus|chickens|bantam None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
9 38ZOM8IO western gorilla NCBITaxon:9593 Gorilla gorilla gorilla None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
10 2R9u9P71 naked mole rat NCBITaxon:10181 Heterocephalus glaber naked mole-rat None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
11 1dpCL6Td human NCBITaxon:9606 Homo sapiens None None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
12 2AmW7N1E rhesus macaque NCBITaxon:9544 Macaca mulatta rhesus monkeys|rhesus macaques|Rhesus monkey None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
13 3RZqbcSL house mouse NCBITaxon:10090 Mus musculus mouse None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
14 2cmrlaa8 european rabbit NCBITaxon:9986 Oryctolagus cuniculus rabbits|rabbit|Japanese white rabbit|domestic ... None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
15 6mtjopSB asian cultivated rice NCBITaxon:4530 Oryza sativa rice|red rice None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
16 1e0pqlXq wild sheep NCBITaxon:9940 Ovis aries sheep|lambs|domestic sheep None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
17 4V2ejovf chimpanzee NCBITaxon:9598 Pan troglodytes None None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
18 5lgRwIC9 schistosoma mansoni NCBITaxon:6183 Schistosoma mansoni None None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
19 2Dgc0iBG tomato NCBITaxon:4081 Solanum lycopersicum None None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
20 7N29aZih swine NCBITaxon:9823 Sus scrofa wild boar|pigs|pig None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1
21 70VIwHUJ maize NCBITaxon:4577 Zea mays None None 1 10 2 2025-02-25 20:05:50.334478+00:00 1 None 1

50 cell lines.

bt.CellLine.df()
uid name ontology_id abbr synonyms description space_id source_id run_id created_at created_by_id _aux _branch_code
id
1 505Oto0b NCI-H1573 CVCL_1478 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
2 2yrJ1RO9 NCI-H460 CVCL_0459 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
3 729jQiCV hTERT-HPNE CVCL_C466 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
4 2MwkQgWO SW48 CVCL_1724 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
5 6SFnBlyJ HOP62 CVCL_1285 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
6 39vFskbz NCI-H1792 CVCL_1495 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
7 3Yy5mGIS SW480 CVCL_0546 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
8 1p59Uds7 HT-29 CVCL_0320 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
9 7KUHx7VC LoVo CVCL_0399 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
10 4Ch2fV9a Hs 766T CVCL_0334 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
11 65gF96lU PANC-1 CVCL_0480 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
12 6h8KcJYp MIA PaCa-2 CVCL_0428 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
13 7aEsdKjg SW 1271 CVCL_1716 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
14 4n6gUGHY RKO CVCL_0504 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
15 1v7Mehiu H4 CVCL_1239 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
16 HLBTHKPg SW1417 CVCL_1717 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
17 tdp1HNAN CFPAC-1 CVCL_1119 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
18 5Jp9rqX7 SW 900 CVCL_1731 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
19 5Vjc1Ubr KATO III CVCL_0371 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
20 EtRJf7f9 C-33 A CVCL_1094 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
21 18kaNqu0 SNU-1 CVCL_0099 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
22 4IfJB0Y2 J82 CVCL_0359 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
23 3PDtUj4s A-172 CVCL_0131 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
24 NvPXo2Hu SHP-77 CVCL_1693 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
25 5N1doHnP SNU-423 CVCL_0366 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
26 6GOSOOui HS-578T CVCL_0332 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
27 7QShig8F A498 CVCL_1056 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
28 6NWX3dtq NCI-H2347 CVCL_1550 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
29 VEd9akJo LOX-IMVI CVCL_1381 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
30 bC8JbRlg NCI-H23 CVCL_1547 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
31 5ewuYry0 Panc 03.27 CVCL_1635 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
32 1CdEQ5dJ LS 180 CVCL_0397 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
33 1mLuzzow HEC-1-A CVCL_0293 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
34 63ZWvcHV HCT15 CVCL_0292 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
35 5XupQdHO COLO 205 CVCL_0218 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
36 4hyU9oFu BT-474 CVCL_0179 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
37 5lqReFKR AN3 CA CVCL_0028 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
38 1lhqeW2v RPMI-7951 CVCL_1666 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
39 39rNVaPP SK-MEL-2 CVCL_0069 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
40 vEfTp1Hk A549 CVCL_0023 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
41 4QH2SpWA NCI-H2030 CVCL_1517 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
42 1K1CzNSi C32 CVCL_1097 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
43 3Oz9gRsu HepG2/C3A CVCL_1098 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
44 5JZNtoDJ AsPC-1 CVCL_0152 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
45 2eQosYls CHP-212 CVCL_1125 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
46 219BOZMe SW 1088 CVCL_1715 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
47 6O2MPQMm A-427 CVCL_1055 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
48 J5Ylm8TV NCI-H2122 CVCL_1531 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
49 7dL2LJjx NCI-H661 CVCL_1577 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1
50 7VaGVBNB NCI-H596 CVCL_1571 None None None 1 None 3 2025-02-25 22:20:20.217993+00:00 1 None 1

100 compounds.

wl.Compound.df()
uid name ontology_id chembl_id abbr synonyms description space_id source_id run_id created_at created_by_id _aux _branch_code
id
380 JRDV3CsZ Tolmetin None None None None None 1 None 3 2025-02-25 22:48:58.568677+00:00 1 None 1
379 x3BfjX6J Peretinoin None None None None None 1 None 3 2025-02-25 22:48:58.568677+00:00 1 None 1
378 18PJ8Lu8 Niclosamide (olamine) None None None None None 1 None 3 2025-02-25 22:48:58.568677+00:00 1 None 1
377 5yeFtKHy Apalutamide None None None None None 1 None 3 2025-02-25 22:48:58.568677+00:00 1 None 1
376 2AEfoFfq Mifepristone None None None None None 1 None 3 2025-02-25 22:48:58.568677+00:00 1 None 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
285 enh3vp0f Ixazomib None None None None None 1 None 3 2025-02-25 22:48:58.568677+00:00 1 None 1
284 4iDSuajq Pimitespib None None None None None 1 None 3 2025-02-25 22:48:58.568677+00:00 1 None 1
283 5y8FwPtA 4EGI-1 None None None None None 1 None 3 2025-02-25 22:48:58.568677+00:00 1 None 1
282 5lagGAeu Torkinib None None None None None 1 None 3 2025-02-25 22:48:58.568677+00:00 1 None 1
281 1NkoYXbq Tubulin inhibitor 6 None None None None None 1 None 3 2025-02-25 22:48:58.568677+00:00 1 None 1

100 rows × 14 columns

The Tahoe collection

Every individual dataset in the atlas is an .h5ad file that is registered as an artifact in LaminDB.

Let us first query for the Tahoe collection.

collection = ln.Collection.get(key="tahoe100")
collection.artifacts.df()
! no run & transform got linked, call `ln.track()` & re-run
! run input wasn't tracked, call `ln.track()` and re-run
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux _branch_code
id
1369 XVSrkq9pyF1OBLgG0000 2025-02-25/h5ad/plate3_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 13173722269 Jnrt7DaSUCGn8D8LS2itaw None 4705402 md5 False False 1 2 3 None True 1 2025-02-25 23:22:20.497965+00:00 1 None 1
1366 vn5cUJCHbjpPPsZx0000 2025-02-25/h5ad/plate14_filt_Vevo_Tahoe100M_WS... None .h5ad.gz dataset AnnData 22427932564 FrnStRehP16siRGG35ou+g None 6518806 md5 False False 1 2 3 None True 1 2025-02-25 23:22:19.357999+00:00 1 None 1
1362 56uA9lPPmJ4zLUcr0000 2025-02-25/h5ad/plate10_filt_Vevo_Tahoe100M_WS... None .h5ad.gz dataset AnnData 26536400717 j1FXsX7hs7u+eBqnWnmNHw None 8044908 md5 False False 1 2 3 None True 1 2025-02-25 23:22:17.849980+00:00 1 None 1
1365 9L9HZ55HqUL0aqaR0000 2025-02-25/h5ad/plate13_filt_Vevo_Tahoe100M_WS... None .h5ad.gz dataset AnnData 28071589885 RKOiaay+CHvv+Ukk/N+28A None 8501658 md5 False False 1 2 3 None True 1 2025-02-25 23:22:18.977981+00:00 1 None 1
1372 aAHQ3zbD7n1asyYr0000 2025-02-25/h5ad/plate6_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 28934897078 NYvQEqVClziHm0ozWhOw1w None 7545393 md5 False False 1 2 3 None True 1 2025-02-25 23:22:21.629962+00:00 1 None 1
1373 DC5cacdJr1VoEXnl0000 2025-02-25/h5ad/plate7_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 16514746341 NOS4MY6eYYPOnAB8ViyWYg None 5692117 md5 False False 1 2 3 None True 1 2025-02-25 23:22:22.009157+00:00 1 None 1
1375 BDttiuV3Te8VB0dU0000 2025-02-25/h5ad/plate9_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 18791302576 4kHbVbmreg6akW6ZgsjxaA None 5866669 md5 False False 1 2 3 None True 1 2025-02-25 23:22:22.759201+00:00 1 None 1
1374 czC19UpUEszVH2bU0000 2025-02-25/h5ad/plate8_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 30390935958 ilAzEPIh4FlDeTFaJ1dILw None 8880979 md5 False False 1 2 3 None True 1 2025-02-25 23:22:22.387666+00:00 1 None 1
1370 tKTeff0ugWqAm4P70000 2025-02-25/h5ad/plate4_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 23292672278 BkBXznbSovNWXtzPFITPcQ None 7004356 md5 False False 1 2 3 None True 1 2025-02-25 23:22:20.879928+00:00 1 None 1
1371 EZATJLC4jE7pmwo40000 2025-02-25/h5ad/plate5_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 19763140865 VMBKFzOI5cj7UC1UDENP4A None 6419498 md5 False False 1 2 3 None True 1 2025-02-25 23:22:21.255154+00:00 1 None 1
1367 aJIqo7bNyJAs9z0r0000 2025-02-25/h5ad/plate1_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 19070623904 9iCNcouMqfNS3HA/2GUWOA None 5481420 md5 False False 1 2 3 None True 1 2025-02-25 23:22:19.737995+00:00 1 None 1
1364 S2h2rPLCaUhZAM9u0000 2025-02-25/h5ad/plate12_filt_Vevo_Tahoe100M_WS... None .h5ad.gz dataset AnnData 37495736876 VjAkWVFGVpzAMi9Innusuw None 10487057 md5 False False 1 2 3 None True 1 2025-02-25 23:22:18.600910+00:00 1 None 1
1368 ZFeVfd0ugAHeWCxm0000 2025-02-25/h5ad/plate2_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 29037152127 usxviuqGbuw0RYnECCVCWw None 8064658 md5 False False 1 2 3 None True 1 2025-02-25 23:22:20.113956+00:00 1 None 1
1363 omn7JStfJMzy8m6O0000 2025-02-25/h5ad/plate11_filt_Vevo_Tahoe100M_WS... None .h5ad.gz dataset AnnData 23230802756 N2mzoYlMLEl6PdecaYyDvw None 7435869 md5 False False 1 2 3 None True 1 2025-02-25 23:22:18.229629+00:00 1 None 1
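
To get a feel for the scale of the collection, the sizes and observation counts in the table above can be added up. The sketch below hard-codes the `size` and `n_observations` values copied from that output, so it runs without a database connection; the `PLATES` dictionary and the variable names are illustrative and not part of LaminDB. With a live connection, the same totals could presumably be derived from the DataFrame returned by `collection.artifacts.df()`.

```python
# (size in bytes, n_observations) per plate, copied from the table above.
PLATES = {
    "plate1": (19070623904, 5481420),
    "plate2": (29037152127, 8064658),
    "plate3": (13173722269, 4705402),
    "plate4": (23292672278, 7004356),
    "plate5": (19763140865, 6419498),
    "plate6": (28934897078, 7545393),
    "plate7": (16514746341, 5692117),
    "plate8": (30390935958, 8880979),
    "plate9": (18791302576, 5866669),
    "plate10": (26536400717, 8044908),
    "plate11": (23230802756, 7435869),
    "plate12": (37495736876, 10487057),
    "plate13": (28071589885, 8501658),
    "plate14": (22427932564, 6518806),
}

# Totals across all plates in the collection.
total_bytes = sum(size for size, _ in PLATES.values())
total_obs = sum(n_obs for _, n_obs in PLATES.values())

# The plate with the most observations.
largest = max(PLATES, key=lambda plate: PLATES[plate][1])

print(f"{len(PLATES)} plates, {total_obs:,} observations, {total_bytes / 1e9:.1f} GB")
print(f"largest plate by observations: {largest}")
```

The observation total lands slightly above 100 million, which matches the "100M" in the dataset name.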

Each dataset was validated with the same schema:

schema = ln.Schema.get(name="tahoe100_anndata_schema")
ln.Artifact.filter(schema=schema).df()
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux _branch_code
id
1369 XVSrkq9pyF1OBLgG0000 2025-02-25/h5ad/plate3_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 13173722269 Jnrt7DaSUCGn8D8LS2itaw None 4705402.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:20.497965+00:00 1 None 1
1366 vn5cUJCHbjpPPsZx0000 2025-02-25/h5ad/plate14_filt_Vevo_Tahoe100M_WS... None .h5ad.gz dataset AnnData 22427932564 FrnStRehP16siRGG35ou+g None 6518806.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:19.357999+00:00 1 None 1
1362 56uA9lPPmJ4zLUcr0000 2025-02-25/h5ad/plate10_filt_Vevo_Tahoe100M_WS... None .h5ad.gz dataset AnnData 26536400717 j1FXsX7hs7u+eBqnWnmNHw None 8044908.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:17.849980+00:00 1 None 1
1365 9L9HZ55HqUL0aqaR0000 2025-02-25/h5ad/plate13_filt_Vevo_Tahoe100M_WS... None .h5ad.gz dataset AnnData 28071589885 RKOiaay+CHvv+Ukk/N+28A None 8501658.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:18.977981+00:00 1 None 1
1372 aAHQ3zbD7n1asyYr0000 2025-02-25/h5ad/plate6_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 28934897078 NYvQEqVClziHm0ozWhOw1w None 7545393.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:21.629962+00:00 1 None 1
1373 DC5cacdJr1VoEXnl0000 2025-02-25/h5ad/plate7_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 16514746341 NOS4MY6eYYPOnAB8ViyWYg None 5692117.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:22.009157+00:00 1 None 1
1375 BDttiuV3Te8VB0dU0000 2025-02-25/h5ad/plate9_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 18791302576 4kHbVbmreg6akW6ZgsjxaA None 5866669.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:22.759201+00:00 1 None 1
15 3Yl20zyG926CkvP50000 2025-02-25/tutorial/plate3_2k-obs.h5ad.gz None .h5ad.gz dataset AnnData 7253540 vv16qryJsVY98jDBqhkr9w None NaN md5 False False 1 2 3 None True 1 2025-02-25 19:31:01.255128+00:00 1 None 1
1374 czC19UpUEszVH2bU0000 2025-02-25/h5ad/plate8_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 30390935958 ilAzEPIh4FlDeTFaJ1dILw None 8880979.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:22.387666+00:00 1 None 1
1370 tKTeff0ugWqAm4P70000 2025-02-25/h5ad/plate4_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 23292672278 BkBXznbSovNWXtzPFITPcQ None 7004356.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:20.879928+00:00 1 None 1
1371 EZATJLC4jE7pmwo40000 2025-02-25/h5ad/plate5_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 19763140865 VMBKFzOI5cj7UC1UDENP4A None 6419498.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:21.255154+00:00 1 None 1
1367 aJIqo7bNyJAs9z0r0000 2025-02-25/h5ad/plate1_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 19070623904 9iCNcouMqfNS3HA/2GUWOA None 5481420.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:19.737995+00:00 1 None 1
1364 S2h2rPLCaUhZAM9u0000 2025-02-25/h5ad/plate12_filt_Vevo_Tahoe100M_WS... None .h5ad.gz dataset AnnData 37495736876 VjAkWVFGVpzAMi9Innusuw None 10487057.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:18.600910+00:00 1 None 1
1368 ZFeVfd0ugAHeWCxm0000 2025-02-25/h5ad/plate2_filt_Vevo_Tahoe100M_WSe... None .h5ad.gz dataset AnnData 29037152127 usxviuqGbuw0RYnECCVCWw None 8064658.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:20.113956+00:00 1 None 1
1363 omn7JStfJMzy8m6O0000 2025-02-25/h5ad/plate11_filt_Vevo_Tahoe100M_WS... None .h5ad.gz dataset AnnData 23230802756 N2mzoYlMLEl6PdecaYyDvw None 7435869.0 md5 False False 1 2 3 None True 1 2025-02-25 23:22:18.229629+00:00 1 None 1

Here is what the features in a dataset look like:

artifact = ln.Artifact.filter(schema=schema).first()  # the first artifact validated with this schema
artifact.describe()
Artifact .h5ad.gz/AnnData
├── General
│   ├── .uid = 'XVSrkq9pyF1OBLgG0000'
│   ├── .key = '2025-02-25/h5ad/plate3_filt_Vevo_Tahoe100M_WServicesFrom_ParseGigalab.h5ad.gz'
│   ├── .size = 13173722269
│   ├── .hash = 'Jnrt7DaSUCGn8D8LS2itaw'
│   ├── .n_observations = 4705402
│   ├── .path = gs://arc-ctc-tahoe100/2025-02-25/h5ad/plate3_filt_Vevo_Tahoe100M_WServicesFrom_ParseGigalab.h5ad.gz
│   ├── .created_by = sunnyosun (Sunny Sun)
│   ├── .created_at = 2025-02-25 23:22:20
│   └── .transform = 'Register Tahoe-100M'
├── Dataset features/schema
│   ├── var62710                 [bionty.Gene.stable_id]                                             
│   │   TSPAN6                      int                                                                 
│   │   TNMD                        int                                                                 
│   │   DPM1                        int                                                                 
│   │   SCYL3                       int                                                                 
│   │   C1orf112                    int                                                                 
│   │   FGR                         int                                                                 
│   │   CFH                         int                                                                 
│   │   FUCA2                       int                                                                 
│   │   GCLC                        int                                                                 
│   │   NFYA                        int                                                                 
│   │   STPG1                       int                                                                 
│   │   NIPAL3                      int                                                                 
│   │   LAS1L                       int                                                                 
│   │   ENPP4                       int                                                                 
│   │   SEMA3F                      int                                                                 
│   │   CFTR                        int                                                                 
│   │   ANKIB1                      int                                                                 
│   │   CYP51A1                     int                                                                 
│   │   KRIT1                       int                                                                 
│   │   RAD52                       int                                                                 
│   └── obs16                    [Feature]                                                           
│       cell_line                   cat[bionty.CellLine.onto…  A-172, A-427, A498, A549, AN3 CA, AsPC-1…
│       cell_name                   cat[bionty.CellLine]       A-172, A-427, A498, A549, AN3 CA, AsPC-1…
│       drug                        cat[wetlab.Compound]       4EGI-1, 9-ING-41, APTO-253, AT7519, AZD-…
│       drugname_drugconc           cat[wetlab.CompoundPertu…  [('4EGI-1', 5.0, 'uM')], [('9-ING-41', 5…
│       pass_filter                 cat[ULabel[PassFilter]]    full, minimal
│       phase                       cat[ULabel[Phase]]         G1, G2M, S
│       plate                       cat[ULabel[Plate]]         plate3
│       sample                      cat[wetlab.Biosample]      smp_1687, smp_1688, smp_1689, smp_1690, …
│       gene_count                  int
│       tscp_count                  int
│       mread_count                 int
│       pcnt_mito                   float
│       S_score                     float
│       G2M_score                   float
│       sublibrary                  str
│       BARCODE                     str
└── Labels
    └── .references                 Reference                  Tahoe-100M: A Giga-Scale Single-Cell Per…
        .compounds                  wetlab.Compound            NG25, venetoclax, Gemcitabine, PH-797804…
        .compound_perturbations     wetlab.CompoundPerturbat…  [('Pemigatinib', 5.0, 'uM')], [('Belzuti…
        .biosamples                 wetlab.Biosample           smp_1760, smp_1764, smp_1754, smp_1743, …
        .organisms                  bionty.Organism            human                                    
        .cell_lines                 bionty.CellLine            Panc 03.27, PANC-1, NCI-H460, HEC-1-A, M…
        .ulabels                    ULabel                     tahoe-100, plate3, G1, G2M, S, full, min…

The genes are indexed with a “stable ID”, a unique mix of Ensembl gene ID and gene symbol.

Every AnnData object measures a broad range of perturbations, biosamples, and cell lines. The plates are approximate replicates of each other.
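
The `drugname_drugconc` labels in the output above are displayed as a list of `(compound, concentration, unit)` tuples. The sketch below shows one way to turn such a label into structured records, assuming the value is available as a string in exactly the form displayed, for example `[('4EGI-1', 5.0, 'uM')]`. The helper `parse_perturbations` is hypothetical and not part of LaminDB or wetlab.

```python
import ast


def parse_perturbations(label):
    """Parse a label like "[('4EGI-1', 5.0, 'uM')]" into a list of dicts.

    Assumes the label is a Python-literal list of 3-tuples, as displayed
    in the describe() output; this format is an assumption, not a spec.
    """
    parsed = ast.literal_eval(label)  # safe evaluation of literals only
    return [
        {"compound": name, "concentration": conc, "unit": unit}
        for name, conc, unit in parsed
    ]


example = "[('4EGI-1', 5.0, 'uM')]"
records = parse_perturbations(example)
print(records)
```

`ast.literal_eval` is used instead of `eval` because it only accepts Python literals and so cannot execute arbitrary code embedded in a label.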

You can download an .h5ad into your local cache like so:

artifact.cache()

Note that, despite the .h5ad.gz suffix, the file is presently not gzip-compressed and can be read as a plain .h5ad file.
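
Because the suffix and the actual on-disk format can disagree, it can be useful to check a downloaded file before handing it to a reader. The sketch below inspects the first bytes of a file: gzip streams start with `1f 8b`, while HDF5 files (which is what .h5ad is) normally start with the 8-byte HDF5 signature. The function `sniff_format` is a hypothetical helper, and the demo writes two small throwaway files rather than touching the real data.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # HDF5 signature, usually at offset 0


def sniff_format(path):
    """Return 'gzip', 'hdf5', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head == HDF5_MAGIC:
        return "hdf5"
    return "unknown"


# Demo on two throwaway files that both carry a misleading .h5ad.gz suffix.
with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "plain.h5ad.gz")
    with open(plain, "wb") as fh:
        fh.write(HDF5_MAGIC + b"\x00" * 8)  # uncompressed, HDF5-like header
    zipped = os.path.join(tmp, "zipped.h5ad.gz")
    with gzip.open(zipped, "wb") as fh:
        fh.write(HDF5_MAGIC)  # genuinely gzip-compressed
    demo = {"plain": sniff_format(plain), "zipped": sniff_format(zipped)}

print(demo)
```

Assuming `artifact.cache()` returns the local path of the downloaded file, calling `sniff_format` on that path would be expected to report `hdf5` for the artifacts described here.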