Bird’s eye view#

You typically want to know where files & datasets came from.

Here, you’ll backtrace file transformations through notebooks, pipelines & app uploads in a complex research project (based on Schmidt22).

Can you give more concrete reasons why I should care about data lineage?

Data lineage allows you to trace biological insights back to their sources, verify experimental outcomes, meet stringent regulatory standards, and generally increase the reproducibility of scientific discoveries.

While tracking data lineage is easier when it's governed by deterministic pipelines, it becomes hard when it's governed by interactive, human-driven analyses.

This is where LaminDB fills a gap in the tooling landscape.

Hide code cell content
# initialize a test instance for this notebook
# this should be run before importing lamindb in Python
!lamin login testuser1
!lamin delete mydata
!lamin init --storage ./mydata
!lamin login testuser2
# load testuser1's instance using testuser2
!lamin load testuser1/mydata
✅ logged in with email testuser1@lamin.ai and id DzTjkKse
💡 deleting instance testuser1/mydata
🔶 could not delete as instance settings do not exist locally. did you provide a wrong instance name? could you try loading it?
💡 creating schemas: core==0.45.2 
🌱 saved: User(id='DzTjkKse', handle='testuser1', email='testuser1@lamin.ai', name='Test User1', updated_at=2023-08-12 05:45:48)
🌱 saved: Storage(id='kADkunWO', root='/home/runner/work/lamin-usecases/lamin-usecases/docs/mydata', type='local', updated_at=2023-08-12 05:45:48, created_by_id='DzTjkKse')
✅ loaded instance: testuser1/mydata
💡 did not register local instance on hub (if you want, call `lamin register`)

✅ logged in with email testuser2@lamin.ai and id bKeW4T6E
💡 found cached instance metadata: /home/runner/.lamin/instance--testuser1--mydata.env
🌱 saved: User(id='bKeW4T6E', handle='testuser2', email='testuser2@lamin.ai', name='Test User2', updated_at=2023-08-12 05:45:51)
✅ loaded instance: testuser1/mydata

import lamindb as ln
✅ loaded instance: testuser1/mydata (lamindb 0.50.3)
Hide code cell content
# to make this guide's example richer, let's create data registered in uploads and pipeline runs by testuser1:
bfx_run_output = ln.dev.datasets.dir_scrnaseq_cellranger(
    "perturbseq", basedir=ln.settings.storage, output_only=False
)
ln.setup.login("testuser1")
transform = ln.Transform(name="Chromium 10x upload", type="pipeline")
ln.track(transform)
file1 = ln.File(bfx_run_output.parent / "fastq/perturbseq_R1_001.fastq.gz")
file1.save()
file2 = ln.File(bfx_run_output.parent / "fastq/perturbseq_R2_001.fastq.gz")
file2.save()
# let's now log in testuser2 to start the guide
ln.setup.login("testuser2")
✅ logged in with email testuser1@lamin.ai and id DzTjkKse
🌱 saved: Transform(id='7QdAuXkrS6mIz8', name='Chromium 10x upload', stem_id='7QdAuXkrS6mI', version='0', type='pipeline', updated_at=2023-08-12 05:45:53, created_by_id='DzTjkKse')
🌱 saved: Run(id='eQjDbd5C6V61trGBYTae', run_at=2023-08-12 05:45:53, transform_id='7QdAuXkrS6mIz8', created_by_id='DzTjkKse')
💡 file in storage '/home/runner/work/lamin-usecases/lamin-usecases/docs/mydata' with key 'fastq/perturbseq_R1_001.fastq.gz'
💡 file in storage '/home/runner/work/lamin-usecases/lamin-usecases/docs/mydata' with key 'fastq/perturbseq_R2_001.fastq.gz'
✅ logged in with email testuser2@lamin.ai and id bKeW4T6E

Track a bioinformatics pipeline#

When working with a pipeline, we’ll register it before running it.

This only happens once and could be done by anyone on your team.

ln.Transform(name="Cell Ranger", version="7.2.0", type="pipeline").save()

Before running the pipeline, query or search for the corresponding transform record:

transform = ln.Transform.filter(name="Cell Ranger", version="7.2.0").one()

Pass the record to track() to set a global run_context:

ln.track(transform)
✅ loaded: Transform(id='ezGsYAqUOyESsM', name='Cell Ranger', stem_id='ezGsYAqUOyES', version='7.2.0', type='pipeline', updated_at=2023-08-12 05:45:54, created_by_id='bKeW4T6E')
🌱 saved: Run(id='CoRnH4SDheN14LhLMkXg', run_at=2023-08-12 05:45:54, transform_id='ezGsYAqUOyESsM', created_by_id='bKeW4T6E')

Now, let’s stage (download) a few files from an instrument upload:

files = ln.File.filter(key__startswith="fastq/perturbseq").all()
filepaths = [file.stage() for file in files]
💡 adding file IAIffwcqv4Av6edxFaG7 as input for run CoRnH4SDheN14LhLMkXg, adding parent transform 7QdAuXkrS6mIz8
💡 adding file t3Nnv4E8cTQNbxrdnzo1 as input for run CoRnH4SDheN14LhLMkXg, adding parent transform 7QdAuXkrS6mIz8

Assume we processed them and obtained 3 output files in a folder 'filtered_feature_bc_matrix':

ln.File.tree("./mydata/perturbseq/filtered_feature_bc_matrix/")
filtered_feature_bc_matrix (0 sub-directories & 3 files): 
├── features.tsv.gz
├── matrix.mtx.gz
└── barcodes.tsv.gz
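For intuition, the listing printed by `ln.File.tree` resembles a plain `pathlib` walk. A minimal sketch of such a helper (hypothetical, not LaminDB code):

```python
from pathlib import Path


def tree(path: str) -> str:
    """Render a one-level listing similar to ln.File.tree's output."""
    root = Path(path)
    entries = sorted(root.iterdir())
    subdirs = [p for p in entries if p.is_dir()]
    files = [p for p in entries if p.is_file()]
    lines = [f"{root.name} ({len(subdirs)} sub-directories & {len(files)} files):"]
    for i, p in enumerate(files):
        branch = "└──" if i == len(files) - 1 else "├──"
        lines.append(f"{branch} {p.name}")
    return "\n".join(lines)
```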

output_files = ln.File.from_dir("./mydata/perturbseq/filtered_feature_bc_matrix/")
ln.save(output_files)
✅ created 3 files from directory using storage /home/runner/work/lamin-usecases/lamin-usecases/docs/mydata and key = perturbseq/filtered_feature_bc_matrix/
🌱 storing file '87fqD9f1GZUT2ZJ9IZIb' with key 'perturbseq/filtered_feature_bc_matrix/features.tsv.gz'
🌱 storing file 'cG78eIKkYrfcj5KDety9' with key 'perturbseq/filtered_feature_bc_matrix/matrix.mtx.gz'
🌱 storing file 'sVSh07hhgZTTIn9SX3aA' with key 'perturbseq/filtered_feature_bc_matrix/barcodes.tsv.gz'

Each of these files now has transform and run records. For instance:

output_files[0].transform
Transform(id='ezGsYAqUOyESsM', name='Cell Ranger', stem_id='ezGsYAqUOyES', version='7.2.0', type='pipeline', updated_at=2023-08-12 05:45:54, created_by_id='bKeW4T6E')
output_files[0].run
Run(id='CoRnH4SDheN14LhLMkXg', run_at=2023-08-12 05:45:54, transform_id='ezGsYAqUOyESsM', created_by_id='bKeW4T6E')

Let’s look at the data lineage at this stage:

output_files[0].view_lineage()
https://d33wubrfki0l68.cloudfront.net/3125eb18b328957948ecba8c957c77e9ebc1e6f5/ce2dc/_images/f16147c34dfedf0c195cb72f76c7a4bb7800cf8ce42ae0acbdb72f9ddfcf4b8f.svg

And let’s keep running the Cell Ranger pipeline in the background:

Hide code cell content
# continue with more processing steps on the Cell Ranger output data
transform = ln.Transform(
    name="Preprocess Cell Ranger outputs", version="2.0", type="pipeline"
)
ln.track(transform)

[f.stage() for f in output_files]
filepath = ln.dev.datasets.schmidt22_perturbseq(basedir=ln.settings.storage)
file = ln.File(filepath, description="perturbseq counts")
file.save()
🌱 saved: Transform(id='P8YMJSNGbe9z0b', name='Preprocess Cell Ranger outputs', stem_id='P8YMJSNGbe9z', version='2.0', type='pipeline', updated_at=2023-08-12 05:45:54, created_by_id='bKeW4T6E')
🌱 saved: Run(id='ntpBREA7zLyoKlYnOkWe', run_at=2023-08-12 05:45:54, transform_id='P8YMJSNGbe9z0b', created_by_id='bKeW4T6E')
💡 adding file 87fqD9f1GZUT2ZJ9IZIb as input for run ntpBREA7zLyoKlYnOkWe, adding parent transform ezGsYAqUOyESsM
💡 adding file cG78eIKkYrfcj5KDety9 as input for run ntpBREA7zLyoKlYnOkWe, adding parent transform ezGsYAqUOyESsM
💡 adding file sVSh07hhgZTTIn9SX3aA as input for run ntpBREA7zLyoKlYnOkWe, adding parent transform ezGsYAqUOyESsM
💡 file in storage '/home/runner/work/lamin-usecases/lamin-usecases/docs/mydata' with key 'schmidt22_perturbseq.h5ad'
💡 file is AnnDataLike, consider using File.from_anndata() to link var_names and obs.columns as features

Track app upload & analytics#

The hidden cell below simulates additional analytic steps including:

  • uploading phenotypic screen data

  • scRNA-seq analysis

  • analyses of the integrated datasets

Hide code cell content
# app upload
ln.setup.login("testuser1")
transform = ln.Transform(name="Upload GWS CRISPRa result", type="app")
ln.track(transform)

# upload and analyze the GWS data
filepath = ln.dev.datasets.schmidt22_crispra_gws_IFNG(ln.settings.storage)
file = ln.File(filepath, description="Raw data of schmidt22 crispra GWS")
file.save()
ln.setup.login("testuser2")
transform = ln.Transform(name="GWS CRIPSRa analysis", type="notebook")
ln.track(transform)

file_wgs = ln.File.filter(key="schmidt22-crispra-gws-IFNG.csv").one()
df = file_wgs.load().set_index("id")
hits_df = df[df["pos|fdr"] < 0.01].copy()
file_hits = ln.File(hits_df, description="hits from schmidt22 crispra GWS")
file_hits.save()
✅ logged in with email testuser1@lamin.ai and id DzTjkKse
🌱 saved: Transform(id='nXCqeblqzJCIz8', name='Upload GWS CRISPRa result', stem_id='nXCqeblqzJCI', version='0', type='app', updated_at=2023-08-12 05:45:55, created_by_id='DzTjkKse')
🌱 saved: Run(id='4PRgkXShgitGEEUUs7L9', run_at=2023-08-12 05:45:55, transform_id='nXCqeblqzJCIz8', created_by_id='DzTjkKse')
💡 file in storage '/home/runner/work/lamin-usecases/lamin-usecases/docs/mydata' with key 'schmidt22-crispra-gws-IFNG.csv'
✅ logged in with email testuser2@lamin.ai and id bKeW4T6E
🌱 saved: Transform(id='vWgq1bwq94jFz8', name='GWS CRIPSRa analysis', stem_id='vWgq1bwq94jF', version='0', type='notebook', updated_at=2023-08-12 05:45:56, created_by_id='bKeW4T6E')
🌱 saved: Run(id='f9J4ZFdFmnIAL3kk6fG9', run_at=2023-08-12 05:45:56, transform_id='vWgq1bwq94jFz8', created_by_id='bKeW4T6E')
💡 adding file lP2qDxYjzi1ggGJvSRwe as input for run f9J4ZFdFmnIAL3kk6fG9, adding parent transform nXCqeblqzJCIz8
💡 file will be copied to default storage upon `save()` with key 'uNJ663ZfSXZNDB9uWrbx.parquet'
💡 file is a dataframe, consider using File.from_df() to link column names as features
🌱 storing file 'uNJ663ZfSXZNDB9uWrbx' with key '.lamindb/uNJ663ZfSXZNDB9uWrbx.parquet'

Let’s see how the data lineage of this looks:

file = ln.File.filter(description="hits from schmidt22 crispra GWS").one()
file.view_lineage()
https://d33wubrfki0l68.cloudfront.net/0a8bc8eb7e0d86489d75a0420d4989444a5a3ed0/73065/_images/f3e1c2aaac8ebc21320ede594365c86b02ad203eecbfeabafc0238f530d37b74.svg

Track notebooks#

In the background, somebody integrated and analyzed the outputs of the app upload and the Cell Ranger pipeline:

Hide code cell content
# let's add analytics on top of the Cell Ranger pipeline and the phenotypic screen
transform = ln.Transform(
    name="Perform single cell analysis, integrating with CRISPRa screen",
    type="notebook",
)
ln.track(transform)

file_ps = ln.File.filter(description__icontains="perturbseq").one()
adata = file_ps.load()
screen_hits = file_hits.load()
import scanpy as sc

sc.tl.score_genes(adata, adata.var_names.intersection(screen_hits.index).tolist())
filesuffix = "_fig1_score-wgs-hits.png"
sc.pl.umap(adata, color="score", show=False, save=filesuffix)
filepath = f"figures/umap{filesuffix}"
file = ln.File(filepath, key=filepath)
file.save()
filesuffix = "fig2_score-wgs-hits-per-cluster.png"
sc.pl.matrixplot(
    adata, groupby="cluster_name", var_names=["score"], show=False, save=filesuffix
)
filepath = f"figures/matrixplot_{filesuffix}"
file = ln.File(filepath, key=filepath)
file.save()
🌱 saved: Transform(id='J6x5ZIyYoEkuz8', name='Perform single cell analysis, integrating with CRISPRa screen', stem_id='J6x5ZIyYoEku', version='0', type='notebook', updated_at=2023-08-12 05:45:56, created_by_id='bKeW4T6E')
🌱 saved: Run(id='GzzvnF7vAdNoTdlAkrLc', run_at=2023-08-12 05:45:56, transform_id='J6x5ZIyYoEkuz8', created_by_id='bKeW4T6E')
💡 adding file HbZdt18uc1X5Q1WhbXUk as input for run GzzvnF7vAdNoTdlAkrLc, adding parent transform P8YMJSNGbe9z0b
💡 adding file uNJ663ZfSXZNDB9uWrbx as input for run GzzvnF7vAdNoTdlAkrLc, adding parent transform vWgq1bwq94jFz8
WARNING: saving figure to file figures/umap_fig1_score-wgs-hits.png
💡 file will be copied to default storage upon `save()` with key 'figures/umap_fig1_score-wgs-hits.png'
🌱 storing file 'c2406U3gQFawDxgv5nK9' with key 'figures/umap_fig1_score-wgs-hits.png'
WARNING: saving figure to file figures/matrixplot_fig2_score-wgs-hits-per-cluster.png
💡 file will be copied to default storage upon `save()` with key 'figures/matrixplot_fig2_score-wgs-hits-per-cluster.png'
🌱 storing file '1LUsIItZ6djAJZ1t6IXu' with key 'figures/matrixplot_fig2_score-wgs-hits-per-cluster.png'

The outcome is a few figures stored as image files. Let's query one of them and look at its data lineage:

file = ln.File.filter(key__contains="figures/matrixplot").one()
file.view_lineage()
https://d33wubrfki0l68.cloudfront.net/4612552a97bf14862c5c562f2add84531107bb78/3fa18/_images/e3ba1b40189f8f171e89f0f39fe14151599c06f84f496cd900fb89961562a465.svg

We’d now like to track the current Jupyter notebook to continue the work:

ln.track()
🌱 saved: Transform(id='1LCd8kco9lZUz8', name='Bird's eye view', short_name='birds-eye', stem_id='1LCd8kco9lZU', version='0', type=notebook, updated_at=2023-08-12 05:45:58, created_by_id='bKeW4T6E')
🌱 saved: Run(id='4jMS4YGDvJCbNm8ZSi5F', run_at=2023-08-12 05:45:58, transform_id='1LCd8kco9lZUz8', created_by_id='bKeW4T6E')

Let’s load the image file:

file.stage()
💡 adding file 1LUsIItZ6djAJZ1t6IXu as input for run 4jMS4YGDvJCbNm8ZSi5F, adding parent transform J6x5ZIyYoEkuz8
PosixPath('/home/runner/work/lamin-usecases/lamin-usecases/docs/mydata/figures/matrixplot_fig2_score-wgs-hits-per-cluster.png')

We see that the image file is tracked as an input of the current notebook. The input is highlighted; the notebook appears at the bottom:

file.view_lineage()
https://d33wubrfki0l68.cloudfront.net/27e053431554c3de055758ba7d2cf1924bb89d95/a4412/_images/9aed3ffaa6f550eac4d9dbd38e42b2552c3460e4c12557d92b7e3c6c156c667d.svg

We can also look purely at the sequence of transforms:

transform = ln.Transform.search("Track data lineage", return_queryset=True).first()
transform.parents.df()
name short_name stem_id version type reference updated_at created_by_id
id
7QdAuXkrS6mIz8 Chromium 10x upload None 7QdAuXkrS6mI 0 pipeline None 2023-08-12 05:45:53 DzTjkKse
transform.view_parents()
https://d33wubrfki0l68.cloudfront.net/f9939e46475402b4715fd184f954cff048f706f4/683ee/_images/dfccc46d29c713a056cf3cf141405a4dabc87ece8547a7bc6789979d00836e2e.svg
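Conceptually, `view_parents` walks the `parents` relation of the transform graph. A toy traversal over stand-in records (pure Python, not LaminDB's implementation; all names below are illustrative):

```python
def all_ancestors(transform, seen=None):
    """Collect every transform reachable upstream via .parents."""
    if seen is None:
        seen = set()
    for parent in transform.parents:
        if parent not in seen:
            seen.add(parent)
            all_ancestors(parent, seen)
    return seen


# toy stand-in for Transform records with a .parents set
class ToyTransform:
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = set(parents)


upload = ToyTransform("Chromium 10x upload")
cellranger = ToyTransform("Cell Ranger", [upload])
notebook = ToyTransform("analysis notebook", [cellranger])
lineage = {t.name for t in all_ancestors(notebook)}
```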

And if you or another user re-runs the notebook, the logging will report its parent transforms:

ln.track()
✅ loaded: Transform(id='1LCd8kco9lZUz8', name='Bird's eye view', short_name='birds-eye', stem_id='1LCd8kco9lZU', version='0', type='notebook', updated_at=2023-08-12 05:45:58, created_by_id='bKeW4T6E')
✅ loaded: Run(id='4jMS4YGDvJCbNm8ZSi5F', run_at=2023-08-12 05:45:58, transform_id='1LCd8kco9lZUz8', created_by_id='bKeW4T6E')
💡   parent transform: Transform(id='J6x5ZIyYoEkuz8', name='Perform single cell analysis, integrating with CRISPRa screen', stem_id='J6x5ZIyYoEku', version='0', type='notebook', updated_at=2023-08-12 05:45:58, created_by_id='bKeW4T6E')

Data lineage graph#

To summarize, let’s re-render the data lineage graph:

file.view_lineage()
https://d33wubrfki0l68.cloudfront.net/27e053431554c3de055758ba7d2cf1924bb89d95/a4412/_images/9aed3ffaa6f550eac4d9dbd38e42b2552c3460e4c12557d92b7e3c6c156c667d.svg

Understand runs#

Under the hood, we already tracked pipeline and notebook runs through run_context.

You can see this most easily by looking at the File.run attribute (in addition to File.transform).

File objects are the inputs and outputs of such runs.

Sometimes, we don’t want to create a global run context but manually pass a run when creating a file:

run = ln.Run(transform=transform)
ln.File(filepath, run=run)

When accessing a file via stage(), load() or backed(), two things happen:

  1. The current run gets added to file.input_of

  2. The transform of that file gets added as a parent of the current transform
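These two steps can be pictured with a toy model of the bookkeeping (pure Python; not LaminDB's actual implementation):

```python
class Transform:
    def __init__(self, name):
        self.name = name
        self.parents = set()


class Run:
    def __init__(self, transform):
        self.transform = transform
        self.inputs = []


class File:
    def __init__(self, name, run):
        self.name = name
        self.run = run       # run that created this file
        self.input_of = []   # runs that consumed this file


def stage(file, current_run):
    """Simulate accessing a file inside a tracked run."""
    # 1. the current run gets added to file.input_of
    file.input_of.append(current_run)
    current_run.inputs.append(file)
    # 2. the file's transform becomes a parent of the current transform
    current_run.transform.parents.add(file.run.transform)


pipeline = Transform("Cell Ranger")
notebook = Transform("analysis notebook")
output = File("matrix.mtx.gz", Run(pipeline))
run = Run(notebook)
stage(output, run)
```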

Run outputs are automatically tracked as data sources once you call ln.track(). If you'd rather not auto-track run inputs, switch it off with ln.settings.track_run_inputs = False (see: Can I automatically track run inputs?).

You can also track run inputs on a case-by-case basis by passing is_run_input=True, e.g.:

file.load(is_run_input=True)

Query by provenance#

We can query or search for the notebook that created the file:

transform = ln.Transform.search("GWS CRIPSRa analysis", return_queryset=True).first()

And then find all the files created by that notebook:

ln.File.filter(transform=transform).df()
storage_id key suffix accessor description version initial_version_id size hash hash_type transform_id run_id updated_at created_by_id
id
uNJ663ZfSXZNDB9uWrbx kADkunWO None .parquet DataFrame hits from schmidt22 crispra GWS None None 18368 yw5f-kMLJhaNhdEF-lhxOQ md5 vWgq1bwq94jFz8 f9J4ZFdFmnIAL3kk6fG9 2023-08-12 05:45:56 bKeW4T6E

Which transform ingested a given file?

file = ln.File.filter().first()
file.transform
Transform(id='7QdAuXkrS6mIz8', name='Chromium 10x upload', stem_id='7QdAuXkrS6mI', version='0', type='pipeline', updated_at=2023-08-12 05:45:53, created_by_id='DzTjkKse')

And which user?

file.created_by
User(id='DzTjkKse', handle='testuser1', email='testuser1@lamin.ai', name='Test User1', updated_at=2023-08-12 05:45:48)

Which transforms were created by a given user?

users = ln.User.lookup(field="handle")
ln.Transform.filter(created_by=users.testuser2).df()
name short_name stem_id version type reference updated_at created_by_id
id
ezGsYAqUOyESsM Cell Ranger None ezGsYAqUOyES 7.2.0 pipeline None 2023-08-12 05:45:54 bKeW4T6E
P8YMJSNGbe9z0b Preprocess Cell Ranger outputs None P8YMJSNGbe9z 2.0 pipeline None 2023-08-12 05:45:54 bKeW4T6E
vWgq1bwq94jFz8 GWS CRIPSRa analysis None vWgq1bwq94jF 0 notebook None 2023-08-12 05:45:56 bKeW4T6E
J6x5ZIyYoEkuz8 Perform single cell analysis, integrating with... None J6x5ZIyYoEku 0 notebook None 2023-08-12 05:45:58 bKeW4T6E
1LCd8kco9lZUz8 Bird's eye view birds-eye 1LCd8kco9lZU 0 notebook None 2023-08-12 05:45:58 bKeW4T6E

Which notebooks were created by a given user?

ln.Transform.filter(created_by=users.testuser2, type="notebook").df()
name short_name stem_id version type reference updated_at created_by_id
id
vWgq1bwq94jFz8 GWS CRIPSRa analysis None vWgq1bwq94jF 0 notebook None 2023-08-12 05:45:56 bKeW4T6E
J6x5ZIyYoEkuz8 Perform single cell analysis, integrating with... None J6x5ZIyYoEku 0 notebook None 2023-08-12 05:45:58 bKeW4T6E
1LCd8kco9lZUz8 Bird's eye view birds-eye 1LCd8kco9lZU 0 notebook None 2023-08-12 05:45:58 bKeW4T6E

And of course, we can also view all recent additions to the entire database:

ln.view()
Hide code cell output
File

storage_id key suffix accessor description version initial_version_id size hash hash_type transform_id run_id updated_at created_by_id
id
1LUsIItZ6djAJZ1t6IXu kADkunWO figures/matrixplot_fig2_score-wgs-hits-per-clu... .png None None None None 28814 JYIPcat0YWYVCX3RVd3mww md5 J6x5ZIyYoEkuz8 GzzvnF7vAdNoTdlAkrLc 2023-08-12 05:45:58 bKeW4T6E
c2406U3gQFawDxgv5nK9 kADkunWO figures/umap_fig1_score-wgs-hits.png .png None None None None 118999 laQjVk4gh70YFzaUyzbUNg md5 J6x5ZIyYoEkuz8 GzzvnF7vAdNoTdlAkrLc 2023-08-12 05:45:57 bKeW4T6E
uNJ663ZfSXZNDB9uWrbx kADkunWO None .parquet DataFrame hits from schmidt22 crispra GWS None None 18368 yw5f-kMLJhaNhdEF-lhxOQ md5 vWgq1bwq94jFz8 f9J4ZFdFmnIAL3kk6fG9 2023-08-12 05:45:56 bKeW4T6E
lP2qDxYjzi1ggGJvSRwe kADkunWO schmidt22-crispra-gws-IFNG.csv .csv None Raw data of schmidt22 crispra GWS None None 1729685 cUSH0oQ2w-WccO8_ViKRAQ md5 nXCqeblqzJCIz8 4PRgkXShgitGEEUUs7L9 2023-08-12 05:45:55 DzTjkKse
HbZdt18uc1X5Q1WhbXUk kADkunWO schmidt22_perturbseq.h5ad .h5ad AnnData perturbseq counts None None 20659936 la7EvqEUMDlug9-rpw-udA md5 P8YMJSNGbe9z0b ntpBREA7zLyoKlYnOkWe 2023-08-12 05:45:54 bKeW4T6E
sVSh07hhgZTTIn9SX3aA kADkunWO perturbseq/filtered_feature_bc_matrix/barcodes... .tsv.gz None None None None 6 4rKXb9tuQUWnNTH2ZUh45g md5 ezGsYAqUOyESsM CoRnH4SDheN14LhLMkXg 2023-08-12 05:45:54 bKeW4T6E
cG78eIKkYrfcj5KDety9 kADkunWO perturbseq/filtered_feature_bc_matrix/matrix.m... .mtx.gz None None None None 6 HcVzgqt2nTf495vRMf4_Cw md5 ezGsYAqUOyESsM CoRnH4SDheN14LhLMkXg 2023-08-12 05:45:54 bKeW4T6E
87fqD9f1GZUT2ZJ9IZIb kADkunWO perturbseq/filtered_feature_bc_matrix/features... .tsv.gz None None None None 6 HXR7vE6fTla-rTkwXG8I-A md5 ezGsYAqUOyESsM CoRnH4SDheN14LhLMkXg 2023-08-12 05:45:54 bKeW4T6E
t3Nnv4E8cTQNbxrdnzo1 kADkunWO fastq/perturbseq_R2_001.fastq.gz .fastq.gz None None None None 6 dTFVvtlfDLFXd-8qsandFw md5 7QdAuXkrS6mIz8 eQjDbd5C6V61trGBYTae 2023-08-12 05:45:53 DzTjkKse
IAIffwcqv4Av6edxFaG7 kADkunWO fastq/perturbseq_R1_001.fastq.gz .fastq.gz None None None None 6 ZyFkKerhv1D1kT9z_aMl1g md5 7QdAuXkrS6mIz8 eQjDbd5C6V61trGBYTae 2023-08-12 05:45:53 DzTjkKse
Run

transform_id run_at created_by_id reference reference_type
id
eQjDbd5C6V61trGBYTae 7QdAuXkrS6mIz8 2023-08-12 05:45:53 DzTjkKse None None
CoRnH4SDheN14LhLMkXg ezGsYAqUOyESsM 2023-08-12 05:45:54 bKeW4T6E None None
ntpBREA7zLyoKlYnOkWe P8YMJSNGbe9z0b 2023-08-12 05:45:54 bKeW4T6E None None
4PRgkXShgitGEEUUs7L9 nXCqeblqzJCIz8 2023-08-12 05:45:55 DzTjkKse None None
f9J4ZFdFmnIAL3kk6fG9 vWgq1bwq94jFz8 2023-08-12 05:45:56 bKeW4T6E None None
GzzvnF7vAdNoTdlAkrLc J6x5ZIyYoEkuz8 2023-08-12 05:45:56 bKeW4T6E None None
4jMS4YGDvJCbNm8ZSi5F 1LCd8kco9lZUz8 2023-08-12 05:45:58 bKeW4T6E None None
Storage

root type region updated_at created_by_id
id
kADkunWO /home/runner/work/lamin-usecases/lamin-usecase... local None 2023-08-12 05:45:51 bKeW4T6E
Transform

name short_name stem_id version type reference updated_at created_by_id
id
1LCd8kco9lZUz8 Bird's eye view birds-eye 1LCd8kco9lZU 0 notebook None 2023-08-12 05:45:58 bKeW4T6E
J6x5ZIyYoEkuz8 Perform single cell analysis, integrating with... None J6x5ZIyYoEku 0 notebook None 2023-08-12 05:45:58 bKeW4T6E
vWgq1bwq94jFz8 GWS CRIPSRa analysis None vWgq1bwq94jF 0 notebook None 2023-08-12 05:45:56 bKeW4T6E
nXCqeblqzJCIz8 Upload GWS CRISPRa result None nXCqeblqzJCI 0 app None 2023-08-12 05:45:55 DzTjkKse
P8YMJSNGbe9z0b Preprocess Cell Ranger outputs None P8YMJSNGbe9z 2.0 pipeline None 2023-08-12 05:45:54 bKeW4T6E
ezGsYAqUOyESsM Cell Ranger None ezGsYAqUOyES 7.2.0 pipeline None 2023-08-12 05:45:54 bKeW4T6E
7QdAuXkrS6mIz8 Chromium 10x upload None 7QdAuXkrS6mI 0 pipeline None 2023-08-12 05:45:53 DzTjkKse
User

handle email name updated_at
id
bKeW4T6E testuser2 testuser2@lamin.ai Test User2 2023-08-12 05:45:51
DzTjkKse testuser1 testuser1@lamin.ai Test User1 2023-08-12 05:45:48
Hide code cell content
!lamin login testuser1
!lamin delete mydata
!rm -r ./mydata
✅ logged in with email testuser1@lamin.ai and id DzTjkKse
💡 deleting instance testuser1/mydata
✅     deleted instance settings file: /home/runner/.lamin/instance--testuser1--mydata.env
✅     instance cache deleted
✅     deleted '.lndb' sqlite file
🔶     consider manually delete your stored data: /home/runner/work/lamin-usecases/lamin-usecases/docs/mydata