Download tags from ExoFOP#
This notebook demonstrates how to bulk download time series observations from ExoFOP and then organize the downloaded data.
Setup#
We’ll start by importing the necessary modules.
import numpy as np
import exofop
from exofop.download import ExoFOPAuthenticator, System, SystemDownloader
# For nice logs, we reformat the root logger's StreamHandler
from exofop.utils.logger import configure_root_logger
configure_root_logger()
Login to ExoFOP#
Accessing data on ExoFOP does not necessarily require logging in. Valid credentials can still be beneficial, because certain types of data uploaded to ExoFOP are eligible for a proprietary period of 12 months, during which they are only accessible within the tfopwg group. After 12 months, the data become publicly available unless the uploader decides to extend the proprietary period.
ⓘ If you do not have an account, set
authenticator = None
and continue this tutorial at Download Observations.
The ExoFOPAuthenticator class#
Since entering credentials naturally warrants some caution, let’s have a look at how the class works.
ExoFOPAuthenticator handles authentication by interacting with the ExoFOP website, storing and managing session cookies, and providing methods to log in again if the session cookies have expired.
During initialisation, the class checks whether valid cookies are available. If they are not,
login is attempted using one of the following methods:
provide the username during initialisation and get prompted for the password during login
use the username and password read from a credential file
The password is never stored in the object, merely the cookies generated upon successful login.
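The decision described above (reuse a stored session while it is valid, log in otherwise) can be sketched in plain Python. This is only an illustration of the logic and does not reflect the package's internals: `remaining_validity` and `needs_login` are hypothetical helpers, and the timestamps are arbitrary example values standing in for whatever the stored cookies record.

```python
from datetime import datetime, timedelta, timezone


def remaining_validity(expires_at: datetime, now: datetime) -> timedelta:
    """Return how long a session stays valid (zero or negative once expired)."""
    return expires_at - now


def needs_login(expires_at: datetime, now: datetime) -> bool:
    """A fresh login is needed once the stored session has expired."""
    return remaining_validity(expires_at, now) <= timedelta(0)


# Arbitrary example timestamps (assumption, not taken from ExoFOP).
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
expires_at = now + timedelta(days=2, hours=2, minutes=46, seconds=22)

print(remaining_validity(expires_at, now))  # 2 days, 2:46:22
print(needs_login(expires_at, now))         # False
```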
So let us initialise an authenticator by providing a username.
Here we explicitly pass cookies_dir its default value to draw attention to the fact that
your cookies will be stored in your local installation of the package unless you specify otherwise.
If this installation is shared with other users, you might consider storing your cookies elsewhere
by changing the provided directory path.
authenticator = ExoFOPAuthenticator(
username='lovelace',
cookies_dir=exofop.COOKIES_DIR, # default
credential_file_path=None,
number_of_cookie_jars=3,
)
INFO:exofop.download.authenticator:Cookies will expire in 2 days, 2:46:22.418284
Next, let us log in to our ExoFOP account.
If there are no valid cookies for this username in the specified directory, you will be prompted to enter your password, unless you have initialised the object with credential_file_path pointing to a credential file that contains your username and password (see the note on credential files in the Appendix).
success = authenticator.login()
print(f'Login success: {success}')
print(authenticator)
INFO:exofop.download.authenticator:Cookies are still valid. No need to login.
Login success: True
ExoFOPAuthenticator(username=lovelace, number_of_cookie_jars=3, remaining_validity=2 days, 2:46:22.394646)
Download Observations#
In the following section, we will download multiple time series observations of a single stellar system.
Identify a system#
Individual systems are identified in this package using the System class, which handles both the TIC and TOI identifiers that may be used to refer to a system.
The class can be initialised by passing a system name, TIC ID and/or TOI ID.
A useful method provided by this class is autocomplete(), which completes the missing ID (TIC or TOI) by performing a lookup in a table loaded from ExoFOP.
# Initialise a system
system = System(name='TOI_1130', tic='254113311', toi=None)
print(f"Target name: {system.target}")
print(system)
# Alternative ways to initialise a system
# As the first argument is name, we could also hand IDs using the prefixes 'TIC' or 'TOI'.
# system = System(name="TOI-1130") # Indicate TOI ID with prefix 'TOI'
# system = System("TIC_254113311") # Indicate TIC ID with prefix 'TIC'
# Less recommended is to give an id without prefix, as it requires an extra lookup
# System(name="254113311")
Target name: 254113311
System(name=TOI_1130, tic=TIC_254113311 toi=None)
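To make the prefix convention from the comments above concrete, the following sketch separates a prefixed name into its catalogue and numeric ID. It is a hypothetical helper written for illustration only; `split_identifier` is not part of the package, and the System class may parse names differently.

```python
import re

# Hypothetical helper, not part of exofop: accept names such as "TOI_1130",
# "TOI-1130" or "TIC_254113311" and split off the catalogue prefix.
_PATTERN = re.compile(r"^(TIC|TOI)[ _-]?(\d+(?:\.\d+)?)$", re.IGNORECASE)


def split_identifier(name: str):
    """Return (prefix, id) for prefixed names, or (None, name) otherwise."""
    cleaned = name.strip()
    match = _PATTERN.match(cleaned)
    if match is None:
        return None, cleaned
    return match.group(1).upper(), match.group(2)


print(split_identifier("TOI-1130"))       # ('TOI', '1130')
print(split_identifier("TIC_254113311"))  # ('TIC', '254113311')
print(split_identifier("254113311"))      # (None, '254113311')
```

An ID without a prefix cannot be attributed to a catalogue from the string alone, which is consistent with the remark that this form requires an extra lookup.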
Download data from a system#
Next we initialise a SystemDownloader, which is the main object for downloading data from ExoFOP that is associated with a single target.
The downloader works asynchronously, meaning that multiple requests can run simultaneously, which shortens the total download time.
data_dir = '.'
system_loader = SystemDownloader(
system=system,
data_dir=data_dir,
authenticator=authenticator, # optional
max_concurrent_downloads=5,
)
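The idea behind a cap such as max_concurrent_downloads can be illustrated with a generic asyncio pattern: a semaphore that lets only a fixed number of coroutines proceed at once. The sketch below is an assumption about the general technique, not the package's implementation; `fetch` and `fetch_all` are hypothetical names, and a short sleep stands in for the network request.

```python
import asyncio


async def fetch(tag: int, limit: asyncio.Semaphore, active: list) -> int:
    """Stand-in for one download; returns how many were running when it started."""
    async with limit:  # wait here while max_concurrent fetches are in flight
        active.append(tag)
        running = len(active)
        await asyncio.sleep(0.01)  # replaces the actual HTTP request
        active.remove(tag)
    return running


async def fetch_all(tags, max_concurrent: int = 5):
    """Start all fetches at once, but let only max_concurrent run simultaneously."""
    limit = asyncio.Semaphore(max_concurrent)
    active: list = []
    return await asyncio.gather(*(fetch(tag, limit, active) for tag in tags))


running_counts = asyncio.run(fetch_all(range(12), max_concurrent=5))
print(max(running_counts))  # never exceeds 5
```

The sketch is meant to be run as a plain script; inside a notebook, where an event loop is already running, `await fetch_all(range(12))` would be used instead of `asyncio.run`.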
Let’s have a look at the time series data on ExoFOP for this target.
tab = system_loader.time_series.table
tab[:5]
TIC ID | TIC | TOI | Telescope | Camera | Filter | Pixel Scale | PSF | Phot Aperture Rad | Obs Date | Obs Duration | Num Obs | Obs Type | Transit Coverage | Delta Mag | User | Group | Tag | Notes |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
int64 | int64 | str11 | str24 | str24 | str10 | float64 | float64 | float64 | str10 | float64 | int64 | str10 | str14 | float64 | str9 | str6 | int64 | str52 |
254113311 | -- | TOI 1130.02 | PEST (0.305 m) | ST-8XME | Rc | 1.23 | 4.48 | 7.0 | 2020-03-30 | 230.0 | 184 | Continuous | Full | 8.3 | tan | tfopwg | 18493 | -- |
254113311 | -- | TOI 1130.02 | LCO-SSO-1.0m (1.0 m) | Sinistro, 1.0m | zs | 0.389 | 1.8 | 9.0 | 2020-03-30 | 177.0 | 167 | Continuous | Full | 9.8 | stockdale | tfopwg | 18598 | -- |
254113311 | -- | TOI 1130.02 | LCO-SSO (1.0 m) | Sinistro | zs | 0.39 | 2.44 | 13.0 | 2020-05-14 | 205.0 | 192 | Continuous | Full | 7.36 | schwarz | tfopwg | 19121 | -- |
254113311 | -- | TOI 1130.02 | LCO-SSO-1m0 (1.0 m) | SINISTRO | zs | 0.389 | 2.1 | 6.0 | 2020-05-14 | 249.0 | 235 | Continuous | Full | 7.3 | conti | tfopwg | 19188 | -- |
254113311 | -- | TOI 1130.02 | LCO-SAAO-1m0 (1.0 m) | SINISTRO | zs | 0.389 | 1.81 | 15.0 | 2020-06-07 | 272.0 | 260 | Continuous | Full | -- | collins | tfopwg | 19442 | nominal .02 with overlapping .01 |
Note that an astropy.table.Table can easily be converted to a pandas DataFrame using tab.to_pandas().
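An overview table like this is mainly useful for deciding which tags to download. As a minimal sketch of such a selection, the rows below are copied from the excerpt above (reduced to three columns) and filtered in plain Python. With the actual astropy table, a boolean mask such as `tab[tab["Filter"] == "zs"]["Tag"]` should give the equivalent result, assuming the column names shown in the header.

```python
# Rows copied from the table excerpt above, reduced to three columns.
rows = [
    {"Telescope": "PEST (0.305 m)", "Filter": "Rc", "Tag": 18493},
    {"Telescope": "LCO-SSO-1.0m (1.0 m)", "Filter": "zs", "Tag": 18598},
    {"Telescope": "LCO-SSO (1.0 m)", "Filter": "zs", "Tag": 19121},
    {"Telescope": "LCO-SSO-1m0 (1.0 m)", "Filter": "zs", "Tag": 19188},
    {"Telescope": "LCO-SAAO-1m0 (1.0 m)", "Filter": "zs", "Tag": 19442},
]

# Keep only the tags of observations taken in the zs filter.
zs_tags = [row["Tag"] for row in rows if row["Filter"] == "zs"]
print(zs_tags)  # [18598, 19121, 19188, 19442]
```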
Besides the time_series data, we can load the following overview tables:
system_loader.imaging.table
system_loader.spectroscopy.table
system_loader.light_curve.table
system_loader.stellar_parameters.table
system_loader.nearby_target.table
We can get the tags given in the tables of any of the above attributes by replacing table with tags, for example:
tags = np.sort(system_loader.time_series.tags)
tags
array([ 5734, 5903, 6038, 6038, 6384, 6384, 18493, 18598,
19121, 19188, 19261, 19319, 19425, 19429, 19435, 19441,
19442, 19442, 19914, 28680, 28680, 29207, 29208, 29262,
33584, 33677, 96267, 96799, 97594, 97874, 97963, 97967,
97973, 98598, 98692, 98844, 98852, 98881, 99828, 107605,
424251, 428772, 428867, 428875, 429211, 429363, 430115, 431063,
431370, 431371, 431527, 431654, 432164, 432682, 432703, 433095,
433104, 433655, 437156])
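The sorted array above contains some values more than once (for example 6038 and 6384). Whether the downloader skips repeated tags is not stated here, so if each archive should be requested only once, the list can be deduplicated first. The sketch below uses a few values copied from the output; on the NumPy array itself, `np.unique(tags)` would return the sorted unique values in one step.

```python
# A few values copied from the sorted tag array above, including repeats.
tags = [5734, 5903, 6038, 6038, 6384, 6384, 18493]

# A set drops the repeats; sorting restores the ascending order.
unique_tags = sorted(set(tags))
print(unique_tags)                   # [5734, 5903, 6038, 6384, 18493]
print(len(tags) - len(unique_tags))  # 2
```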
Next, let us download the first two of these tags using the download method.
This will download the .zip files to system_loader.zip_dir, which lies inside system_loader.target_dir. If unzip=True, the .zip files will be unpacked to output_dir, which defaults to target_dir. Therefore, the following lines of code should result in
data_dir/ # as given to system_loader
├──TOI_1130/ # target_dir, named using `system.name`
│ ├── 5734/
│ ├── 5903/
│ ├── zip/
│ │ ├── 5734.zip
│ │ ├── 5903.zip
As we are running this code from a Jupyter notebook, which always has an open event loop, we need to use await before calling the download method.
await system_loader.download(tags[:2], output_dir=None, unzip=True) # type: ignore
print(f"Target directory: {system_loader.target_dir}")
INFO:httpx:HTTP Request: GET https://exofop.ipac.caltech.edu/tess/download_tag_files_zip.php?tag=5903 "HTTP/1.1 200 OK"
INFO:exofop.download:file_name='5903.zip': OK
ERROR:exofop.download:Request error:
WARNING:exofop.download:Retrying download (1/2) for https://exofop.ipac.caltech.edu/tess/download_tag_files_zip.php?tag=5734
ERROR:exofop.download:Request error:
INFO:exofop.download:file_name='5734.zip': TIMEOUT
INFO:exofop.download.downloaders:Downloaded 1 of 2 files (50.0 %) successfully.
INFO:exofop.download.downloaders:Successfully downloaded 1 files in 11.568667875 seconds.
Target directory: ./TOI_1130
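In the log above, one of the two downloads timed out. A simple way to find out which tags still need to be fetched is to compare the requested tags with the directories present on disk. The sketch below assumes the layout described earlier (one unpacked directory per tag inside the target directory) and recreates it in a temporary directory; `missing_tags` is a hypothetical helper, not part of the package.

```python
import tempfile
from pathlib import Path


def missing_tags(target_dir, requested):
    """Return the requested tags that have no unpacked directory yet."""
    target = Path(target_dir)
    return [tag for tag in requested if not (target / str(tag)).is_dir()]


# Recreate the situation from the log in a temporary directory:
# tag 5903 was downloaded and unpacked, tag 5734 timed out.
with tempfile.TemporaryDirectory() as tmp:
    target_dir = Path(tmp) / "TOI_1130"
    (target_dir / "5903").mkdir(parents=True)
    (target_dir / "zip").mkdir()
    (target_dir / "zip" / "5903.zip").touch()

    retry = missing_tags(target_dir, [5734, 5903])
    print(retry)  # [5734]
```

In the notebook, the resulting list could then be passed to the download method again, for example `await system_loader.download(retry, unzip=True)`.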
Bulk Loading System Observations#
For convenience, some methods have been added that download all tags available for a specific type of measurement, e.g.
import os
# Simplified sketch of the SystemDownloader.download_time_series method
def download_time_series(self, output_path=None, unzip=True):
    # Collect every tag listed in the time series table
    tags = self.time_series.tags
    if output_path is None:
        output_path = os.path.join(self.target_dir, 'time_series')
    return self.download(tags=tags, output_dir=output_path, unzip=unzip)
download_imaging and download_spectroscopy behave analogously.
Here the resulting file structure will default to the following, in order to facilitate the simultaneous analysis of different types of measurements.
TOI_1130/ # target_dir, a subdirectory of the provided `data_dir`, named using `system.name`
├── time_series/
│ ├── 5734/
│ ├── 5903/
│ ├── ...
├── zip/
│ ├── 5734.zip
│ ├── 5903.zip
│ ├── ...
Conclusion#
In this notebook, we’ve demonstrated how to log in to ExoFOP, select from the available observations for a given system, and bulk download time series observation files. You can now explore and analyze the downloaded data further.
Continue with the next tutorial to learn how to extract and standardise the measurement files of all downloaded observations.
Appendix#
Note on credential files#
This package accepts .txt files and .yaml files as credential files.
These files should adhere to the following structures:
.txt | .yaml |
---|---|
username=your_username | username: your_username |
In terms of security, it’s important to note that credential files containing your username and password pose a higher risk than entering your password when prompted. After all, a malicious actor with access to your computer could potentially locate these files and obtain your full ExoFOP credentials. Cookies, by contrast, have a limited lifespan, which, as of the writing of this tutorial, is 7 days for ExoFOP.
If you do not want your cookies to be stored on your machine for later usage, you can remove all cookies at the end of a session by running authenticator.delete_cookies().
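If a credential file is used despite the caveats above, restricting its permissions so that only its owner can read it reduces the exposure on shared machines. The following is a generic sketch using the standard library on a POSIX system; the file name `credentials.yaml` is a hypothetical example, and the file is created in a temporary directory purely for demonstration.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical credential file, created here only for the demonstration.
    path = os.path.join(tmp, "credentials.yaml")
    with open(path, "w") as handle:
        handle.write("username: your_username\n")

    # Owner may read and write; group and others get no access (mode 0o600).
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)

    mode = stat.S_IMODE(os.stat(path).st_mode)

print(oct(mode))  # 0o600 on POSIX systems
```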
TL;DR: For most purposes, entering a password once a week should represent the right balance between convenience and security.