Download tags from ExoFOP#

This notebook demonstrates how to bulk download time series observations from ExoFOP and then organize the downloaded data.

Setup#

We’ll start by importing the necessary modules.

import numpy as np

import exofop
from exofop.download import ExoFOPAuthenticator, System, SystemDownloader

# For nice logs, we reformat the root logger's StreamHandler
from exofop.utils.logger import configure_root_logger
configure_root_logger()

Login to ExoFOP#

Accessing data on ExoFOP does not necessarily require logging in, but having valid credentials can be beneficial as certain types of data uploaded to ExoFOP are eligible for a proprietary period of 12 months within the tfopwg group. After 12 months, the data become publicly available unless the uploader decides to extend the proprietary period.

ⓘ If you do not have an account, set authenticator = None and continue this tutorial at Download Observations.

The ExoFOPAuthenticator class#

Since handing credentials to third-party code is understandably uncomfortable, let’s have a look at how it works. ExoFOPAuthenticator handles authentication by interacting with the ExoFOP website, storing and managing session cookies, and providing methods to log in when the session cookies have expired. During initialisation, the class checks whether valid cookies are available. If they are not, login is attempted using one of the following methods:

  • provide the username during initialisation and enter the password when prompted during login

  • use the username and password read from a credential file

The password is never stored in the object, merely the cookies generated upon successful login.

So let us initialise an authenticator providing a username. Here we have explicitly handed cookies_dir its default value, to draw attention to the fact that your cookies will be stored in your local installation of the package unless you specify otherwise. If this installation is shared with other users, you might consider storing your cookies elsewhere, by changing the provided directory path.

authenticator = ExoFOPAuthenticator(
    username='lovelace',
    cookies_dir=exofop.COOKIES_DIR,  # default
    credential_file_path=None,
    number_of_cookie_jars=3,
)
INFO:exofop.download.authenticator:Cookies will expire in 2 days, 2:46:22.418284

Next, let us log in to our ExoFOP account.

If there are no valid cookies for this username in the specified directory, you will be prompted to enter your password, unless you have initialised the object with a credential file, located at credential_file_path, in which you specify your username and password (see Note on credential files in the Appendix).

success = authenticator.login()
print(f'Login success: {success}')

print(authenticator)
INFO:exofop.download.authenticator:Cookies are still valid. No need to login.
Login success: True
ExoFOPAuthenticator(username=lovelace, number_of_cookie_jars=3, remaining_validity=2 days, 2:46:22.394646)

Download Observations#

In the following section, we will download multiple time series observations of a single stellar system.

Identify a system#

Individual systems are identified in this package using the `System` class, which helps to handle both TIC and TOI identifiers that may be used to refer to the system.

The class can be initialised by passing a system name, TIC ID and/or TOI ID. A useful method provided by this class is autocomplete(), which completes the missing ID (TIC or TOI) by performing a lookup in a table loaded by ExoFOP.

# Initialise a system
system = System(name='TOI_1130', tic='254113311', toi=None)

print(f"Target name: {system.target}")
print(system)

# Alternative ways to initialise a system
# As the first argument is name, we could also hand IDs using the prefixes 'TIC' or 'TOI'.
# system = System(name="TOI-1130")  # Indicate TOI ID with prefix 'TOI'
# system = System("TIC_254113311")  # Indicate TIC ID with prefix 'TIC'

# Less recommended is to give an id without prefix, as it requires an extra lookup
# System(name="254113311")
Target name: 254113311
System(name=TOI_1130, tic=TIC_254113311 toi=None)
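To make the prefix convention above concrete, here is a minimal sketch of how a name such as 'TOI-1130' or 'TIC_254113311' could be split into a catalogue prefix and a number. The helper `parse_identifier` and its regular expression are hypothetical and only illustrate the idea. They are not the parsing logic used by `System`.

```python
import re

# Hypothetical helper for illustration; not part of the exofop package.
# Accepts an optional 'TIC' or 'TOI' prefix, an optional separator
# (space, underscore or hyphen), and a number with optional decimals.
_ID_PATTERN = re.compile(r"^(TIC|TOI)?[\s_-]*(\d+(?:\.\d+)?)$", re.IGNORECASE)


def parse_identifier(name):
    """Split a name into (prefix, number); prefix is None if absent."""
    match = _ID_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"Unrecognised identifier: {name!r}")
    prefix, number = match.groups()
    return (prefix.upper() if prefix else None, number)


for name in ["TOI_1130", "TOI-1130", "TIC_254113311", "254113311"]:
    print(name, "->", parse_identifier(name))
```

A name without a prefix yields `None` as the prefix, which mirrors why the tutorial recommends prefixed names: without the prefix, the catalogue cannot be determined from the string alone.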

Download data from a system#

Next we initialise a SystemDownloader, which is the main object for downloading data from ExoFOP that is associated with a single target. The downloader works asynchronously, meaning that multiple requests can run concurrently, which shortens the total download time.

data_dir = '.'

system_loader = SystemDownloader(
    system=system,
    data_dir=data_dir,
    authenticator=authenticator,  # optional
    max_concurrent_downloads=5,
)
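The effect of a concurrency limit such as max_concurrent_downloads can be illustrated with a small, self-contained sketch based on `asyncio.Semaphore`. This is an assumption about the general technique, not a description of how SystemDownloader is implemented. The functions `fetch` and `fetch_all` are hypothetical, and the network request is replaced by a short sleep.

```python
import asyncio


async def fetch(tag, semaphore, counters):
    """Pretend to download one tag while respecting the concurrency limit."""
    async with semaphore:
        counters["active"] += 1
        counters["peak"] = max(counters["peak"], counters["active"])
        await asyncio.sleep(0.01)  # stands in for the actual HTTP request
        counters["active"] -= 1
        return f"{tag}.zip"


async def fetch_all(tags, max_concurrent_downloads=5):
    """Run all pseudo-downloads, at most max_concurrent_downloads at a time."""
    semaphore = asyncio.Semaphore(max_concurrent_downloads)
    counters = {"active": 0, "peak": 0}
    files = await asyncio.gather(*(fetch(t, semaphore, counters) for t in tags))
    return files, counters["peak"]


files, peak = asyncio.run(fetch_all(range(12), max_concurrent_downloads=5))
print(f"{len(files)} files fetched, peak concurrency: {peak}")
```

The call to `asyncio.run` works in a plain script. Inside a Jupyter notebook, where an event loop is already running, `files, peak = await fetch_all(range(12))` would be used instead.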

Let’s have a look at the time series data on ExoFOP for this target.

tab = system_loader.time_series.table
tab[:5]
Table length=5
TIC ID | TIC | TOI | Telescope | Camera | Filter | Pixel Scale | PSF | Phot Aperture Rad | Obs Date | Obs Duration | Num Obs | Obs Type | Transit Coverage | Delta Mag | User | Group | Tag | Notes
int64 | int64 | str11 | str24 | str24 | str10 | float64 | float64 | float64 | str10 | float64 | int64 | str10 | str14 | float64 | str9 | str6 | int64 | str52
254113311 | -- | TOI 1130.02 | PEST (0.305 m) | ST-8XME | Rc | 1.23 | 4.48 | 7.0 | 2020-03-30 | 230.0 | 184 | Continuous | Full | 8.3 | tan | tfopwg | 18493 | --
254113311 | -- | TOI 1130.02 | LCO-SSO-1.0m (1.0 m) | Sinistro, 1.0m | zs | 0.389 | 1.8 | 9.0 | 2020-03-30 | 177.0 | 167 | Continuous | Full | 9.8 | stockdale | tfopwg | 18598 | --
254113311 | -- | TOI 1130.02 | LCO-SSO (1.0 m) | Sinistro | zs | 0.39 | 2.44 | 13.0 | 2020-05-14 | 205.0 | 192 | Continuous | Full | 7.36 | schwarz | tfopwg | 19121 | --
254113311 | -- | TOI 1130.02 | LCO-SSO-1m0 (1.0 m) | SINISTRO | zs | 0.389 | 2.1 | 6.0 | 2020-05-14 | 249.0 | 235 | Continuous | Full | 7.3 | conti | tfopwg | 19188 | --
254113311 | -- | TOI 1130.02 | LCO-SAAO-1m0 (1.0 m) | SINISTRO | zs | 0.389 | 1.81 | 15.0 | 2020-06-07 | 272.0 | 260 | Continuous | Full | -- | collins | tfopwg | 19442 | nominal .02 with overlapping .01

Note that an astropy.table.Table can easily be converted to a pandas DataFrame using tab.to_pandas().

Besides the time_series data, we can load the following overview tables:

  • system_loader.imaging.table

  • system_loader.spectroscopy.table

  • system_loader.light_curve.table

  • system_loader.stellar_parameters.table

  • system_loader.nearby_target.table

We can get the tags given in the tables of any of the above attributes by replacing table with tags, for example:

tags = np.sort(system_loader.time_series.tags)
tags
array([  5734,   5903,   6038,   6038,   6384,   6384,  18493,  18598,
        19121,  19188,  19261,  19319,  19425,  19429,  19435,  19441,
        19442,  19442,  19914,  28680,  28680,  29207,  29208,  29262,
        33584,  33677,  96267,  96799,  97594,  97874,  97963,  97967,
        97973,  98598,  98692,  98844,  98852,  98881,  99828, 107605,
       424251, 428772, 428867, 428875, 429211, 429363, 430115, 431063,
       431370, 431371, 431527, 431654, 432164, 432682, 432703, 433095,
       433104, 433655, 437156])
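The listing above contains repeated values (for example 6038 and 6384), presumably because the same tag appears in more than one table row. Whether the downloader removes such repeats itself is not stated here, so a cautious sketch is to deduplicate before requesting files. The helper `unique_tags` is hypothetical and works on any iterable of integer-like tags, including a NumPy array.

```python
def unique_tags(tags):
    """Return the distinct tags in ascending order as plain Python ints."""
    return sorted({int(tag) for tag in tags})


# First entries of the listing above; 6038 and 6384 each appear twice.
listed = [5734, 5903, 6038, 6038, 6384, 6384, 18493, 18598]
print(unique_tags(listed))  # [5734, 5903, 6038, 6384, 18493, 18598]
```

For NumPy arrays, `np.unique(tags)` gives the same sorted, duplicate-free result in a single call.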

Next, let us download the first two of these tags using the download method. This will download the .zip files to system_loader.zip_dir, which lies inside system_loader.target_dir. If unzip=True, the .zip files will be unpacked to output_dir, which defaults to target_dir. Therefore, the following lines of code should result in:

data_dir/  # as given to system_loader
├── TOI_1130/  # target_dir, named using `system.name`
│   ├── 5734/
│   ├── 5903/
│   ├── zip/
│   │   ├── 5734.zip
│   │   ├── 5903.zip

As we are running this code from a Jupyter notebook, which always has an open event loop, we need to use await before calling the download method.

⚠️ If we were to call this method in a script instead, we would need to omit `await`.
await system_loader.download(tags[:2], output_dir=None, unzip=True)  # type: ignore
print(f"Target directory: {system_loader.target_dir}")
INFO:httpx:HTTP Request: GET https://exofop.ipac.caltech.edu/tess/download_tag_files_zip.php?tag=5903 "HTTP/1.1 200 OK"
INFO:exofop.download:file_name='5903.zip': OK
ERROR:exofop.download:Request error: 
WARNING:exofop.download:Retrying download (1/2) for https://exofop.ipac.caltech.edu/tess/download_tag_files_zip.php?tag=5734
ERROR:exofop.download:Request error: 
INFO:exofop.download:file_name='5734.zip': TIMEOUT
INFO:exofop.download.downloaders:Downloaded 1 of 2 files (50.0 %) successfully.
INFO:exofop.download.downloaders:Successfully downloaded 1 files in 11.568667875 seconds.
Target directory: ./TOI_1130
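In the log above, the request for tag 5734 timed out, so only one of the two archives was unpacked. A minimal sketch for finding such gaps is to compare the requested tags with the tag directories that actually exist. The helper `missing_tags` is hypothetical and assumes the layout described above, with one directory per tag directly inside target_dir.

```python
import tempfile
from pathlib import Path


def missing_tags(target_dir, tags):
    """Return the tags that have no unpacked directory inside target_dir."""
    target_dir = Path(target_dir)
    return [int(tag) for tag in tags if not (target_dir / str(int(tag))).is_dir()]


# Demonstration on a throwaway directory that mimics the run above:
# 5903 was unpacked, while 5734 timed out and left no directory behind.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "5903").mkdir()
    retry = missing_tags(tmp, [5734, 5903])
    print(retry)  # [5734]
```

In the notebook, the same check could be run as `missing_tags(system_loader.target_dir, tags[:2])`, and the returned list passed to another `await system_loader.download(...)` call.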

Bulk Loading System Observations#

For convenience, some methods have been added that download all tags available for a specific type of measurement, e.g.

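import os

# Simplified sketch of SystemDownloader.download_time_series
def download_time_series(self, output_path=None, unzip=True):
    tags = self.time_series.tags  # all tags listed in the time series table
    if output_path is None:
        # Default to a dedicated sub-directory of the target directory
        output_path = os.path.join(self.target_dir, 'time_series')
    return self.download(tags=tags, output_dir=output_path, unzip=unzip)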

download_imaging and download_spectroscopy behave analogously.

Here the resulting file structure will default to the following, in order to facilitate the simultaneous analysis of different types of measurements.

TOI_1130/  # target_dir, a subdirectory of the provided `data_dir`, named using `system.name`
├── time_series/
│   ├── 5734/
│   ├── 5903/
│   ├── ...
├── zip/
│   ├── 5734.zip
│   ├── 5903.zip
│   ├── ...
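Given this layout, a quick overview of what has been unpacked can be obtained by walking the target directory. The following is a minimal sketch under the assumption that each measurement type has its own sub-directory containing one directory per tag. The helper `list_observations` is hypothetical and not part of the package.

```python
import tempfile
from pathlib import Path


def list_observations(target_dir):
    """Map each measurement-type directory to the sorted tags unpacked in it."""
    overview = {}
    for type_dir in sorted(Path(target_dir).iterdir()):
        # Skip stray files, the archive directory, and tag directories that
        # were unpacked directly into target_dir by an earlier download.
        if not type_dir.is_dir() or type_dir.name == "zip" or type_dir.name.isdigit():
            continue
        overview[type_dir.name] = sorted(
            int(p.name) for p in type_dir.iterdir() if p.is_dir() and p.name.isdigit()
        )
    return overview


# Demonstration on a throwaway directory that mimics the structure above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for sub in ["time_series/5734", "time_series/5903", "zip"]:
        (root / sub).mkdir(parents=True)
    overview = list_observations(root)
    print(overview)  # {'time_series': [5734, 5903]}
```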

Conclusion#

In this notebook, we’ve demonstrated how to log in to ExoFOP, select from the available observations for a given system, and bulk download time series observation files. You can now explore and analyze the downloaded data further.

Continue with the next tutorial to learn how to extract and standardise the measurement files of all downloaded observations.

Appendix#

Note on credential files#

This package accepts .txt files and .yaml files as credential files.

These files should adhere to the following structures:

.txt file

username=your_username
password=your_password

.yaml file

username: your_username
password: your_password
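To illustrate the key=value layout of the .txt variant, here is a minimal parsing sketch. The function `read_txt_credentials` is hypothetical and is not the parser used by ExoFOPAuthenticator. It only shows that each line splits at the first '=' sign, so a password that itself contains '=' is preserved.

```python
def read_txt_credentials(text):
    """Parse 'key=value' lines into a dict, ignoring blank lines."""
    credentials = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at the first '=' only, so values may contain '=' themselves.
        key, _, value = line.partition("=")
        credentials[key.strip()] = value.strip()
    return credentials


example = "username=your_username\npassword=your_password\n"
print(read_txt_credentials(example))
```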

In terms of security, it’s important to note that credential files containing your username and password pose a higher risk than entering your password when prompted. After all, a malicious actor with access to your computer could locate these files and obtain your full ExoFOP credentials. In contrast, cookies have a limited lifespan, which, as of the writing of this tutorial, is 7 days for ExoFOP.

If you do not want your cookies to be stored on your machine for later usage, you can remove all cookies at the end of a session, by running authenticator.delete_cookies().

TL;DR: For most purposes, entering a password once a week should represent the right balance between convenience and security.