ESAP-DB is an ESAP component providing managed database services.
In order to use ESAP-DB in Python, a client needs to be instantiated:
from esap_client import ESAPClient
client = ESAPClient()
A project defines a scope, in which collections of tables can be created. Let's create our first project:
project = client.create_project('my_project', 'My first project 😀')
project.describe()
Name my_project Description My first project 😀 Type user Max size 10.0 GiB Datasets []
In this project, let's create a dataset, i.e., a group of tables:
dataset = project.create_dataset('my_dataset', 'My first dataset 😃')
dataset.describe()
Name my_project.my_dataset Description My first dataset 😃 Tables []
We can check that the dataset has been created inside our project:
project.datasets
[<Dataset my_project.my_dataset>]
Pandas DataFrames can be used as a data source.
import numpy as np
import pandas as pd
df = pd.DataFrame({
'a': np.random.randn(10),
'b': np.random.randn(10),
'c': np.random.randn(10),
})
table = dataset.create_table_from(df, "my_pandas_table")
table.describe()
Name my_project.my_dataset.my_pandas_table Description Shape 10 rows x 3 columns
We can check that the table has been added to the dataset
dataset.describe()
Name my_project.my_dataset Description My first dataset 😃 Tables my_pandas_table
The tables in a dataset can be accessed
dataset.tables
[<Table my_project.my_dataset.my_pandas_table>]
The object table
does not hold any data, it is a proxy object to an actual database table. To download it as a Pandas dataframe:
table.aspandas()
a | b | c | |
---|---|---|---|
0 | -0.017895 | 0.581371 | 0.067971 |
1 | -0.387042 | -0.280934 | -0.035224 |
2 | -0.528579 | -1.361485 | 1.209838 |
3 | -0.058013 | -0.523124 | 0.008259 |
4 | -0.777601 | -1.650967 | -0.777421 |
5 | -0.214714 | 0.372448 | 0.480047 |
6 | -0.061562 | 0.730865 | -0.465868 |
7 | 0.475183 | 1.413025 | 0.498238 |
8 | 0.536889 | 1.929530 | -0.758486 |
9 | 1.044491 | -0.485938 | -0.638685 |
External CSV files can be imported into a table:
url = 'https://www.data.gouv.fr/fr/datasets/r/c0f59f00-3ab2-4f31-8a05-d317b43e9055'
table = dataset.create_table_from(url, sep=';')
table.describe()
Name my_project.my_dataset.sp_18ans_dep_2021_08_04_19h06 Description https://www.data.gouv.fr/fr/datasets/r/c0f59f00-3ab2-4f31-8a05-d317b43e9055 Shape 45494 rows x 6 columns
table.aspandas()
dep | semaine | age_18ans | ti | tp | td | |
---|---|---|---|---|---|---|
0 | 01 | 2020-S21 | 2 | 0.00 | 0.00 | 217.02 |
1 | 01 | 2020-S21 | 5 | 0.00 | 0.00 | 161.19 |
2 | 01 | 2020-S21 | 10 | 0.00 | 0.00 | 159.85 |
3 | 01 | 2020-S21 | 14 | 0.00 | 0.00 | 107.83 |
4 | 01 | 2020-S21 | 17 | 3.98 | 2.17 | 183.19 |
... | ... | ... | ... | ... | ... | ... |
45489 | 974 | 2021-S30 | 0 | 393.51 | 9.17 | 4291.37 |
45490 | 975 | 2021-S30 | 0 | 16.68 | 1.96 | 850.43 |
45491 | 976 | 2021-S30 | 0 | 12.17 | 1.00 | 1212.29 |
45492 | 977 | 2021-S30 | 0 | 1686.58 | 8.70 | 19375.56 |
45493 | 978 | 2021-S30 | 0 | 322.64 | 6.43 | 5015.00 |
45494 rows × 6 columns
Inside the same project, the tables from different datasets can be combined in a query.
import pandas as pd
dataset1 = project.create_dataset('dataset1')
df_fruits = pd.DataFrame({'x': 6 * ['fruit'], 'y': list('🍓🥝🍇🍐🍏🍍')})
fruits = dataset1.create_table_from(df_fruits, 'fruits')
fruits.aspandas()
x | y | |
---|---|---|
0 | fruit | 🍓 |
1 | fruit | 🥝 |
2 | fruit | 🍇 |
3 | fruit | 🍐 |
4 | fruit | 🍏 |
5 | fruit | 🍍 |
dataset2 = project.create_dataset('dataset2')
df_vegetables = pd.DataFrame({'x': 5 * ['vegetable'], 'y': list('🥑🌽🥒🍆🥦')})
vegetables = dataset2.create_table_from(df_vegetables, 'vegetables')
vegetables.aspandas()
x | y | |
---|---|---|
0 | vegetable | 🥑 |
1 | vegetable | 🌽 |
2 | vegetable | 🥒 |
3 | vegetable | 🍆 |
4 | vegetable | 🥦 |
dataset3 = project.create_dataset('dataset3')
query = """
SELECT * FROM dataset1.fruits
UNION
SELECT * FROM dataset2.vegetables
ORDER BY x, y
"""
food = dataset3.create_table_as(query, 'food')
food.aspandas()
x | y | |
---|---|---|
0 | fruit | 🍇 |
1 | fruit | 🍍 |
2 | fruit | 🍏 |
3 | fruit | 🍐 |
4 | fruit | 🍓 |
5 | fruit | 🥝 |
6 | vegetable | 🥦 |
7 | vegetable | 🌽 |
8 | vegetable | 🍆 |
9 | vegetable | 🥑 |
10 | vegetable | 🥒 |
We can store the result of an ESAP-API query into a table that belongs to the dataset that we have just created.
query = {
"level": "raw",
"collection": "imaging",
"ra": 342.16,
"dec": 33.94,
"fov": 10,
"archive_uri": "apertif"
}
table = dataset.create_table_from_esap_gateway_query(query, 'apertif_table', 'Apertif cone search')
table.aspandas()
name | PID | dataProductType | dataProductSubType | generatedByActivity | datasetID | RA | dec | fov | release | thumbnail | storageRef | url | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | WSRTA190409015_B000.MS | pid-WSRTA190409015_B000.MS | visibility | uncalibratedVisibility | http://alta.astron.nl/altapi/activities/289169/ | 190409015 | 342.162500 | 33.944444 | 0.2 | SVC_2019_Imaging | https://alta.astron.nl | cold:190409015/WSRTA190409015_B000.MS.tar | https://alta.astron.nl/webdav/SVC_2019_Imaging... |
1 | WSRTA190409015_B001.MS | pid-WSRTA190409015_B001.MS | visibility | uncalibratedVisibility | http://alta.astron.nl/altapi/activities/289169/ | 190409015 | 340.350988 | 32.918112 | 0.2 | SVC_2019_Imaging | https://alta.astron.nl | cold:190409015/WSRTA190409015_B001.MS.tar | https://alta.astron.nl/webdav/SVC_2019_Imaging... |
2 | WSRTA190409015_B002.MS | pid-WSRTA190409015_B002.MS | visibility | uncalibratedVisibility | http://alta.astron.nl/altapi/activities/289169/ | 190409015 | 340.908159 | 32.924915 | 0.2 | SVC_2019_Imaging | https://alta.astron.nl | cold:190409015/WSRTA190409015_B002.MS.tar | https://alta.astron.nl/webdav/SVC_2019_Imaging... |
3 | WSRTA190409015_B003.MS | pid-WSRTA190409015_B003.MS | visibility | uncalibratedVisibility | http://alta.astron.nl/altapi/activities/289169/ | 190409015 | 341.465567 | 32.929246 | 0.2 | SVC_2019_Imaging | https://alta.astron.nl | cold:190409015/WSRTA190409015_B003.MS.tar | https://alta.astron.nl/webdav/SVC_2019_Imaging... |
4 | WSRTA190409015_B004.MS | pid-WSRTA190409015_B004.MS | visibility | uncalibratedVisibility | http://alta.astron.nl/altapi/activities/289169/ | 190409015 | 342.023107 | 32.931102 | 0.2 | SVC_2019_Imaging | https://alta.astron.nl | cold:190409015/WSRTA190409015_B004.MS.tar | https://alta.astron.nl/webdav/SVC_2019_Imaging... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
526 | WSRTA200624213_B035.MS | pid-WSRTA200624213_B035.MS | visibility | uncalibratedVisibility | http://alta.astron.nl/altapi/activities/1988825/ | 200624213 | 341.052670 | 30.092919 | 0.2 | APERTIF_DR1_Imaging | None | cold:200624213/WSRTA200624213_B035.MS.tar | https://alta.astron.nl/webdav/APERTIF_DR1_Imag... |
527 | WSRTA200624213_B036.MS | pid-WSRTA200624213_B036.MS | visibility | uncalibratedVisibility | http://alta.astron.nl/altapi/activities/1988825/ | 200624213 | 341.593556 | 30.093473 | 0.2 | APERTIF_DR1_Imaging | None | cold:200624213/WSRTA200624213_B036.MS.tar | https://alta.astron.nl/webdav/APERTIF_DR1_Imag... |
528 | WSRTA200624213_B037.MS | pid-WSRTA200624213_B037.MS | visibility | uncalibratedVisibility | http://alta.astron.nl/altapi/activities/1988825/ | 200624213 | 342.134418 | 30.091812 | 0.2 | APERTIF_DR1_Imaging | None | cold:200624213/WSRTA200624213_B037.MS.tar | https://alta.astron.nl/webdav/APERTIF_DR1_Imag... |
529 | WSRTA200624213_B038.MS | pid-WSRTA200624213_B038.MS | visibility | uncalibratedVisibility | http://alta.astron.nl/altapi/activities/1988825/ | 200624213 | 342.675160 | 30.087937 | 0.2 | APERTIF_DR1_Imaging | None | cold:200624213/WSRTA200624213_B038.MS.tar | https://alta.astron.nl/webdav/APERTIF_DR1_Imag... |
530 | WSRTA200624213_B039.MS | pid-WSRTA200624213_B039.MS | visibility | uncalibratedVisibility | http://alta.astron.nl/altapi/activities/1988825/ | 200624213 | 343.215685 | 30.081850 | 0.2 | APERTIF_DR1_Imaging | None | cold:200624213/WSRTA200624213_B039.MS.tar | https://alta.astron.nl/webdav/APERTIF_DR1_Imag... |
531 rows × 13 columns
All the TAP services can be access through ESAP-DB. Queries can be stored as a table in a user's dataset.
A VO table is referenced in ESAP-DB by tap_service.schema_name.table_name
.
client.describe()
Projects aip_gavo_org_tap, archive_stsci_edu_caomtap, archive_stsci_edu_hsctap, archive_stsci_edu_hscv2tap, archive_stsci_edu_hscv3tap, archive_stsci_edu_ps1dr2tap, archive_stsci_edu_regtap, astro_ucl_ac_uk_tap, astron_nl_tap, asu_cas_cz_tap, au_csiro_atoavo_tap, au_csiro_casda_tap, au_csiro_psrda_atnf_pulsar_data_archive, bira_iasb_tap, byu_arvo_tap, cadc_nrc_ca_argus, cadc_nrc_ca_youcat, cdpp_amda, cdpp_illu67p, cdpp_transplanet, cds_simbad_tap, cds_vizier_obstap, cds_vizier_tap, cefca_j_plus_j_plus_dr1, fu_berlin_planet_hrsc_tap, cefca_j_plus_j_plus_dr2, cefca_minijpas_minij_pas_pdr201912, chivo_tap, cxc_harvard_edu_cda, cxc_harvard_edu_csc, cxc_harvard_edu_cscr1, gaia_aip_de_tap, geops_ipsl_tap, cxc_harvard_edu_cscr2, esavo_ehst_tap, esavo_esasky_tap, esavo_esaskylegacy_tap, esavo_gaia_tap, esavo_hsa_tap, esavo_isla_obsloctap, esavo_iso_tap, esavo_psa_epntap, esavo_registry_tap, esavo_xmm_tap, eso_org_tap_cat, eso_org_tap_obs, nci_org_au_macho_tap, ia2_inaf_it_hosted_laurino2011_tap, ia2_inaf_it_iaps_tap, ia2_inaf_it_tap, ia2_inaf_it_tap_projects, iap_tap, idoc_tap, irsa_ipac_tap, jao_alma_tap, jvo_agn, jvo_alma, jvo_isas_akari_fis_v1, jvo_isas_darts_halca_halca_vsop_correlated_data, jvo_isas_darts_hitomi_hitomaster_v1, nci_org_au_skymapper_tap, nci_org_au_wigglez_tap, ned_ipac_tap, oca_mp3c, org_gavo_dc_tap, jvo_kug2000, jvo_nobeyama, jvo_saga, jvo_subaru_hds, jvo_subaru_moircs, jvo_subaru_spcam, konkoly_hu_tap, lam_cesam_tap, latmos_ipsl_tap, lmd_jussieu_tap, musewide_aip_de_tap, nasa_heasarc_services_xamin, osug_vo_osug_tap, ov_gso_climso, purx_tap, pvol_tap, sao_ru_dsa_cats_wsdb, spectrum_iaa_tap, src_pas_tap, swinburne_tao, tohoku_univ_jp_tap, uni_heidelberg_de_gaia_tap, vo_plasma_oeaw_ac_at_tap, vopdc_obspm_gepi_gaia, vopdc_obspm_imcce_dynastvo_epn, vopdc_obspm_imcce_m4ast, vopdc_obspm_lesia_apis_epn, vopdc_obspm_lesia_basecom_epn, vopdc_obspm_lesia_bdip_epn, vopdc_obspm_lesia_hfc1ar_epn, vopdc_obspm_lesia_hfc1t3_epn, vopdc_obspm_lesia_iks_epn, vopdc_obspm_lesia_planets_epn, vopdc_obspm_lesia_titan_epn, vopdc_obspm_lesia_tnosarecool_epn
data_sources = [
('radio', 'astron_nl_tap.apertif_dr1.spectral_cubes', 'centeralpha', 'centerdelta', 10.),
('optical', 'esavo_gaia_tap.gaiaedr3.gaia_source', 'ra', 'dec', 1.),
('neutrino', 'org_gavo_dc_tap.icecube.nucand', 'nualpha', 'nudelta', 10.),
]
query = """
SELECT *
FROM {catalog}
WHERE 1 = CONTAINS(
POINT('ICRS', {ra}, {dec}),
CIRCLE('ICRS', 207.5, 52.5, {radius})
)
"""
tables = {}
for name, catalog, ra, dec, radius in data_sources:
print(f'Cross-matching {name} source...', end='')
q = query.format(catalog=catalog, ra=ra, dec=dec, radius=radius)
tables[name] = dataset.create_table_as(q, name)
print(f' {len(tables[name])} results.')
Cross-matching radio source... 1908 results. Cross-matching optical source... 9667 results. Cross-matching neutrino source... 164 results.
tables['optical'].describe()
Name my_project.my_dataset.optical Description SELECT * FROM esavo_gaia_tap.gaiaedr3.gaia_source WHERE 1 = CONTAINS( POINT('ICRS', ra, dec), CIRCLE('ICRS', 207.5, 52.5, 1.0) ) Shape 9667 rows x 98 columns
tables['optical'].aspandas()
solution_id | source_id | random_index | ref_epoch | ra | ra_error | dec | dec_error | parallax | parallax_error | ... | dr2_radial_velocity | dr2_radial_velocity_error | dr2_rv_nb_transits | dr2_rv_template_teff | dr2_rv_template_logg | dr2_rv_template_fe_h | l | b | ecl_lon | ecl_lat | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1636042515805110273 | 1512468369600475904 | 420509426 | 2016 | 207.896220 | 0.305644 | 51.531989 | 0.304428 | 0.587519 | 0.373524 | ... | NaN | NaN | 0 | NaN | NaN | NaN | 102.042561 | 63.087629 | 175.382276 | 56.524502 |
1 | 1636042515805110273 | 1512469331673152896 | 1648021090 | 2016 | 208.022016 | 0.370128 | 51.555304 | 0.394424 | 1.835239 | 0.455937 | ... | NaN | NaN | 0 | NaN | NaN | NaN | 101.920539 | 63.027566 | 175.464673 | 56.592342 |
2 | 1636042515805110273 | 1512469335969266816 | 1558956725 | 2016 | 208.017081 | 0.225193 | 51.554426 | 0.230831 | 0.225695 | 0.280723 | ... | NaN | NaN | 0 | NaN | NaN | NaN | 101.925352 | 63.029894 | 175.461398 | 56.589709 |
3 | 1636042515805110273 | 1512469400392115584 | 1784132731 | 2016 | 208.056547 | 0.592267 | 51.577344 | 0.618987 | 0.283812 | 0.724402 | ... | NaN | NaN | 0 | NaN | NaN | NaN | 101.904894 | 62.997632 | 175.469162 | 56.623008 |
4 | 1636042515805110273 | 1512469812709488128 | 1697265783 | 2016 | 207.936055 | 0.248966 | 51.541328 | 0.254912 | 0.225448 | 0.315043 | ... | NaN | NaN | 0 | NaN | NaN | NaN | 102.006041 | 63.066948 | 175.406104 | 56.547491 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9662 | 1636042515805110273 | 1561149349999298176 | 1277897740 | 2016 | 207.763663 | 0.097224 | 53.460415 | 0.109553 | 1.684692 | 0.127503 | ... | NaN | NaN | 0 | NaN | NaN | NaN | 104.243268 | 61.451272 | 172.951988 | 57.937282 |
9663 | 1636042515805110273 | 1561149457374185600 | 370339584 | 2016 | 207.814254 | 0.026466 | 53.467603 | 0.027957 | 2.013792 | 0.033770 | ... | NaN | NaN | 0 | NaN | NaN | NaN | 104.195197 | 61.430519 | 172.985488 | 57.962635 |
9664 | 1636042515805110273 | 1561149491732941440 | 1021411229 | 2016 | 207.788757 | 0.736398 | 53.466885 | 1.011097 | 2.457080 | 1.010273 | ... | NaN | NaN | 0 | NaN | NaN | NaN | 104.222324 | 61.438433 | 172.964974 | 57.952032 |
9665 | 1636042515805110273 | 1561149560453834368 | 26286277 | 2016 | 207.748082 | 0.021362 | 53.468960 | 0.022575 | 0.492876 | 0.027242 | ... | NaN | NaN | 0 | NaN | NaN | NaN | 104.268866 | 61.448210 | 172.928235 | 57.937528 |
9666 | 1636042515805110273 | 1561149659236282240 | 713259813 | 2016 | 207.728150 | 0.373806 | 53.487870 | 0.434466 | -0.943177 | 0.497549 | ... | NaN | NaN | 0 | NaN | NaN | NaN | 104.309571 | 61.437271 | 172.887881 | 57.943812 |
9667 rows × 98 columns