Joss-reviews: [REVIEW]: EMBERS: Experimental Measurement of BEam Responses with Satellites

Created on 1 Sep 2020  ·  71 Comments  ·  Source: openjournals/joss-reviews

Submitting author: @amanchokshi (Aman Chokshi)
Repository: https://github.com/amanchokshi/EMBERS
Version: v0.7.0
Editor: @mbobra
Reviewers: @teuben, @mbobra
Archive: Pending

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/5d0d8c8aa9cfa96790cf4109d1fc00c7"><img src="https://joss.theoj.org/papers/5d0d8c8aa9cfa96790cf4109d1fc00c7/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/5d0d8c8aa9cfa96790cf4109d1fc00c7/status.svg)](https://joss.theoj.org/papers/5d0d8c8aa9cfa96790cf4109d1fc00c7)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@teuben & @mbobra, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @mbobra know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest.

Review checklist for @teuben

Conflict of interest

  • [x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Contribution and authorship: Has the submitting author (@amanchokshi) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • [x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [ ] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • [x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • [x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Review checklist for @mbobra

Conflict of interest

  • [x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • [x] Repository: Is the source code for this software available at the repository url?
  • [x] License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • [x] Contribution and authorship: Has the submitting author (@amanchokshi) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • [x] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?

Functionality

  • [x] Installation: Does installation proceed as outlined in the documentation?
  • [x] Functionality: Have the functional claims of the software been confirmed?
  • [x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • [x] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • [x] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • [x] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • [x] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • [x] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • [x] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • [x] State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • [x] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • [x] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

Most helpful comment

I've been adding some comments in the issue tracker of the repo; the author is working on those. I'm about halfway through the examples, which is a bit more work than I thought. I'll sleep on it and continue tomorrow.

All 71 comments

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @teuben, @hasantahir it looks like you're currently assigned to review this paper :tada:.

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository, which means that with GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1017/pasa.2018.30 is OK

MISSING DOIs

- None

INVALID DOIs

- None

👋 @teuben @hasantahir Thank you so much for agreeing to review this submission! You'll see your reviewer checklists above. If you'd like more information on any of these items, please reference the review criteria. Also, we encourage open communication between the reviewers, submitting author, and editor. So please feel free to ask questions here on this review thread. I'm happy to help!

/ooo September 14 until September 18

@teuben @hasantahir How is the review going? Do you need any help? Please let me know, I'm happy to help!

@teuben Please respond and let me know if you are still able to review.

@whedon remove @hasantahir as reviewer

OK, @hasantahir is no longer a reviewer

@whedon add @mbobra as reviewer

OK, @mbobra is now a reviewer

👋 @amanchokshi I am going through the checklist and your package looks really nice. It has been an easy review so far. I have a couple comments:

Functionality

  • I am trying to follow the Embers by Example guide and am stuck on the second code block. Is test_tool installed with embers? When I try to run the command python -m test_tool, I get the error: No module named test_tool.

Documentation

  • I am a bit confused about the statement of need in the documentation, since it reads slightly differently than the JOSS paper (I am not an expert in radio astronomy).

From the online documentation:

EMBERS is a python package designed to enable polarised measurements of radio telescope antenna beam-patterns using satellites.

From the JOSS paper:

EMBERS is a python package which provides a modular framework for radio telescopes and interferometric arrays such as the MWA, HERA, and the upcoming SKA to accurately measure the all sky beam responses of their antennas using weather and communication satellites.

Could you perhaps clarify and expand the statement of need in the documentation? Also, the documentation doesn't mention the target audience for this software (e.g. astronomers, systems engineers?). Could you please include this?

Software Paper

  • The statement of need includes this sentence: "Chokshi et al (in prep) represents the first complete large scale implementation of a satellite based beam measurement system." Could you please write one or two sentences about what has been done in the past, why this is an improvement, and how this software compares to other commonly-used packages?
  • Could you please write a sentence to explain the gray (unobserved) areas in the caption of Figure 1?
  • We prefer citations with a DOI. The paper includes a couple references to Chokshi et al., in prep, which isn't yet peer-reviewed and doesn't yet have a DOI. Would it be possible to replace this with another reference or update it with a DOI?

@mbobra Thanks Monica. I've addressed most of your comments.

Functionality

In Embers by Example, the second and third code blocks are meant for people unfamiliar with python. test_tool is an example package which doesn't exist, but is used to demonstrate that the tools can be run using either python -m test_tool or simply test_tool. The following code block shows that help can be accessed for each tool using the --help flag. I've made the preceding text clearer by saying that this is an example tool.
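For anyone curious how a tool can be exposed both ways, here is a minimal illustrative sketch (not EMBERS's actual packaging; test_tool and main are hypothetical names): a console_scripts entry point puts the bare test_tool command on the PATH, while a test_tool/__main__.py with a main() function makes python -m test_tool work.

from setuptools import setup

# Illustrative setup.py for a hypothetical `test_tool` package.
# The console_scripts entry point exposes the `test_tool` command,
# pointing at the main() function in test_tool/__main__.py.
setup(
    name="test_tool",
    packages=["test_tool"],
    entry_points={
        "console_scripts": ["test_tool = test_tool.__main__:main"],
    },
)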

Documentation

I've updated the statement of need in the online docs to match the JOSS paper and have included the target audience.

Paper

Added that all previous experiments were validations of the methodology, and this was the first large scale test. There was no previously available software suitable for the large data volume. I've also updated the caption of Fig. 1 with much more detail on what all the plots mean, including the grey regions.

Unfortunately, Chokshi et al., in prep is under collaboration review right now, and will probably take a couple of months before it's published. The other relevant paper is Line et al., which I've cited. It's about one of the previous validation experiments and is less relevant, as it deals with a much smaller scale and does not use the EMBERS package.

@whedon generate pdf

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

@amanchokshi Thanks for your updates, I really appreciate it. A couple more comments:

Examples
I'm working through the examples and am still a bit confused.

One example in the Embers by Example docs says:

To get a quick preview of the raw RF data, we create waterfall plots. The following code creates a waterfall plot of sample data provided with EMBERS

$ waterfall_single
--------------------------------------------------
No input data provided, using packaged sample data
>>> waterfall_single --help, for more options
--------------------------------------------------
Waterfall plot saved to ./embers_out/rf_tools/S06XX_2019-10-10-02:30.png

But this doesn't work exactly as written for me; I have to do a

from embers.rf_tools import rf_data
rf_data.single_waterfall

But then I'm stuck, because I don't see any example rf_tiles to use with rf_data.single_waterfall().

Would it be possible to create a couple examples that a user can execute from start to finish? You have a really nice example gallery, so by no means am I asking to re-do all of it, but I think users may be confused about how exactly to get started. Some sample data may be a nice way to quickly show users how they can use the package.

References

Unfortunately, Chokshi et al., in prep is under collaboration review right now, and will probably take a couple of months before it's published. The other relevant paper is Line et al., which I've cited. It's about one of the previous validation experiments and is less relevant, as it deals with a much smaller scale and does not use the EMBERS package.

I totally understand; does the journal allow you to put Chokshi et al. on a preprint service, such as arXiv? Readers need to have access to the references, so Chokshi et al. must be accessible or replaced with a reference to something else. If you presented the work as a poster or talk at a conference before, then you can put your poster or slides on Zenodo and get a DOI for it and reference that instead. Let me know if you want help with that.

@mbobra thanks Monica, it seems like you've caught a bug in my code. I created a new virtual env and installed EMBERS following my examples. I had the same issue. Apparently my sample data is not being installed with pip. I will try and get it to work or modify my instructions.

For now, what works:

git clone https://github.com/amanchokshi/EMBERS.git
cd EMBERS
pip install --editable .

waterfall_single

References

I will talk to my supervisor and see when I can upload a draft of my paper to arXiv. By the end of this review process I may be able to update the paper with an arXiv DOI. If not, I can definitely upload a conference talk to Zenodo and cite that.

@mbobra I've fixed the bug. If you update embers to the latest version, 0.7.3, the examples should now work as shown in the docs.

@amanchokshi I'm sorry -- I'm probably missing something simple. I'm looking at the Embers by Example guide, which says:

$ mkdir ~/EMBERS
$ cd ~/EMBERS

So I did that. Then I skipped the stuff about the test tool. Then the next code block says:

waterfall_single

But I'm still confused 😟 Where are the sample data? How do I access these data? What import statements do I need to make before I can run waterfall_single? Are there any example notebooks I can look at?

I was looking at a recently published JOSS paper called PyCS3 (which I picked at random). They provide a tutorial and example notebooks, which users can download and run. I don't want to be overly prescriptive -- PyCS3's approach is by no means the only way to provide examples, but I personally find it easy to understand. Perhaps this can be fixed by a few explanatory sentences if I am missing something simple.

References

I will talk to my supervisor and see when I can upload a draft of my paper to arXiv. By the end of this review process I may be able to update the paper with an arXiv DOI. If not, I can definitely upload a conference talk to Zenodo and cite that.

Yes, another possibility (if you want this JOSS paper and your other peer-reviewed paper to act as companion papers) is to pause publication of this JOSS paper until the other one is accepted for publication. Then you can cite the DOI for that paper in this one and vice versa.

@mbobra

waterfall_single is a cli tool which makes it easy to interact with the underlying single_waterfall function.

Internally, what waterfall_single does:

import pkg_resources
from embers.rf_tools.rf_data import single_waterfall

# If no data provided, use package sample data
rf_file = pkg_resources.resource_filename("embers.kindle", "data/rf_data/S06XX/2019-10-10/S06XX_2019-10-10-02:30.txt")
out_dir  = "embers_out/rf_tools"

single_waterfall(rf_file, out_dir)

In Embers by Example I have added a link which takes you to the API docs of the relevant function. If you think this is sufficient, I can add similar links to all the examples.

Alternatively, I could create a notebook with accompanying example data.

@amanchokshi

Ah, this is awesome! 🚀 Yes I do think having one or two simple notebooks, exactly like the code block above, will really help folks get started.

I checked off the functionality box above; after the notebooks, I'll check off the examples. Then all we have to do is sort out the references and we're done!

@mbobra Thanks Monica! I'll let you know when I've completed the notebooks. It may take a couple of days.

Hi @mbobra, there have been quite significant updates to the Embers by Example section of the documentation.

Can you begin by updating to version 0.7.4 and deleting all the old files you may have created?

The instructions begin with downloading one day of sample data from a new data repository, which all the subsequent examples use. Each section now has instructions on how to use the cli-tool, with an equivalent example code-block below. I've also linked the functions used, which will take you to the relevant section of the EMBERS API.

Certain examples will take a little while to run and I expect that you can run all the examples in less than 3 hours. It's important to run them in the sequence presented, because data products created early on are used at later stages.

Hope things work this time and are much more clear 😁

Just a suggestion on this test though: is there no way you could make this test run in a few minutes? If a cosmological simulation takes 3 hours, you can either take fewer steps or a smaller grid, but it should functionally give some answer (that can be compared to something given as a baseline result). I'm not too happy that I would need to wait 3 hours on a test.


@teuben I agree that it would be nice if the tests run faster. This is not a requirement for passing review, though (unless the software makes performance claims). I advise opening an issue on the repo about the runtime. And I'm happy to co-sign that. But this is not a blocking item.

@amanchokshi Thank you for all this hard work. I'm happy to recommend Embers for publication pending an accessible reference to Chokshi et al., in prep.

Hi @teuben, the tests for embers are automated, and take ~20 minutes on Travis.

The examples shown in Embers by Example do take longer, but are meant to be a guide to show new users how to begin interacting with Embers. I've reduced the amount of sample data as much as possible (only 1 day of data, instead of 6 months), but some of the later functions are statistical in nature and require multiple satellite passes along with their corresponding data files to accurately calibrate the instrument.

Thanks @mbobra, I've talked to my supervisor regarding the reference to Chokshi et al., in prep, and we've decided to remove the reference from the JOSS paper. The other reference, Line et al., outlines the premise of the experimental setup and provides sufficient scientific background for this work.

I've removed Chokshi et al., in prep, from the paper and have reworded things a bit. Hope that's okay.

@whedon generate pdf

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

I've been adding some comments in the issue tracker of the repo; the author is working on those. I'm about halfway through the examples, which is a bit more work than I thought. I'll sleep on it and continue tomorrow.

@teuben Thanks a lot Peter!

@amanchokshi As I'm running through the examples, you mentioned there is a quick test, presumably via pytest.ini and the tests/ directory. I don't see an example command line in the README or elsewhere showing how to run this, so "beginners" will be baffled. Can you add that somewhere (and remind me here as well, as I don't remember)?

@amanchokshi apart from some minor things I know you are looking into, my last hurdle on the checkmark list is performance.

There are no good comments on performance that I could find (in fact, in the Example there should be some warnings in more places, but you are working on the aggressive CPU usage already). I have not looked at the code, but let me try to explain how I would like to try and work towards an answer to why this is so slow. We have N files, with each file M lines. Yet, the example you mentioned took 40 core hours. Can this be understood in perhaps an O(N) analysis? Are there no loops in python? Is everything done in good numpy notation, without looping?
There is of course the potential speedup using XLA/numba/dask, which is mentioned in one of the issues, but before resorting to this, it would be nice if we could understand in simple terms why it takes as long as it does.

I've added instructions to run the automated tests at the bottom of the examples page - Testing EMBERS. It should take no more than 2 minutes to run the tests. To run the tests you need to clone the repository and install a few additional dependencies:

git clone https://github.com/amanchokshi/EMBERS.git
cd EMBERS

# Set up a virtual environment
python -m venv embers-env
source embers-env/bin/activate

pip install .
pip install -r requirements_dev.txt

# Run automated tests
pytest

@teuben I've been playing around with align_batch and the --max_cores option to try and get a handle on the parallel performance. On my laptop, I've tried running all the sample data with between 1 and 4 cpu cores. I used the GNU time tool to gauge the performance.

The system cpu usage vs number of cpu cores is as expected:
system_cpu

The time vs number of cpu cores seems to fall off at higher numbers of cores. This is probably partially due to running things on my laptop, which has limited free resources.
time_cpu

Below are the raw terminal outputs which I used to generate the two figures

$ time align_batch --out_dir=tmp --max_cores=1
>>> Aligned files saved to: tmp
>>> align_batch --out_dir=tmp --max_cores=1  879.14s user 47.24s system 104% cpu 14:45.86 total

$ time align_batch --out_dir=tmp --max_cores=2
>>> Aligned files saved to: tmp
>>> align_batch --out_dir=tmp --max_cores=2  999.20s user 72.56s system 212% cpu 8:23.51 total

$ time align_batch --out_dir=tmp --max_cores=3
>>> Aligned files saved to: tmp
>>> align_batch --out_dir=tmp --max_cores=3  1012.36s user 129.45s system 309% cpu 6:08.94 total

$ time align_batch --out_dir=tmp --max_cores=4
>>> Aligned files saved to: tmp
>>> align_batch --out_dir=tmp --max_cores=4  1143.28s user 137.17s system 401% cpu 5:18.70 total

In the code, I have a for loop which selects pairs of data files, which will be temporally aligned, smoothed using a savgol filter and interpolated to a lower frequency. Beyond the initial for loop to select the files, the interpolation and filtering is achieved using scipy acting on numpy arrays.

I don't really know that much about code optimization, but I think that numpy and scipy may already be optimized to grab available cores. If that's the case, it could explain the drop off we see in time vs number of cores.
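One way to test that hypothesis is to pin the library thread pools to a single thread and see whether the per-core timings change. A rough sketch, not part of EMBERS: the environment variable names cover the common BLAS/OpenMP backends, and the array shape and filter settings are taken from the examples in this thread.

import os

# Pin the common BLAS/OpenMP thread pools to one thread each.
# This must happen before numpy/scipy are imported to take effect.
for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ[var] = "1"

import numpy as np
from scipy.signal import savgol_filter

# Toy array with the shape of one aligned file (1800 timesteps x 112 channels)
data = np.random.rand(1800, 112)
smooth = savgol_filter(data, 11, 2, axis=0)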

The above tests used the 1440 data files available in the sample data repository. The complete data set includes 202,000 files with ~17,000 lines of data each, amounting to almost 300 GB of data.

This is a very useful analysis to see how it scales with the processor! Nicely done. Theoretically, for ideal parallel behavior, the user and system time should not depend on max_cores (NP), and the elapsed time goes as 1/NP. So, although we do see that the % ((user+system)/elapsed) scales quite nicely, you can see the user time goes from 879" to 1143", a 30% "loss". You can see the system time went from 47" to 137", a factor of 3, far worse. This could point to I/O contention more than CPUs clashing. Modern i7 laptop CPUs are pretty good in general, but terrible when it comes to dealing with parallel tasks when 100% of the threads are used. You didn't try more than 4; I'm guessing there are 4 cores, each with 2 threads, the most common situation these days, so CPU-wise you didn't get to the worst point. But the system time scales much worse. In my experience, running between 4 and 8 actually doesn't improve the elapsed time; in fact, more often than not it reverses and goes slower.
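For reference, a small sketch that turns the elapsed times above into speedup and parallel efficiency (numbers taken directly from the time outputs quoted earlier):

# Elapsed (wall clock) seconds per core count, from the `time` outputs above
elapsed = {1: 14 * 60 + 45.86, 2: 8 * 60 + 23.51, 3: 6 * 60 + 8.94, 4: 5 * 60 + 18.70}

t1 = elapsed[1]
for n, t in sorted(elapsed.items()):
    speedup = t1 / t
    print(f"{n} cores: speedup {speedup:.2f}x, efficiency {speedup / n:.0%}")

# Roughly: 1.76x (88%) on 2 cores, 2.40x (80%) on 3, 2.78x (70%) on 4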

There is still the lingering question of whether we can understand a single-core performance of 15 minutes. Each txt file in tiles_data has about 17k lines. I only see one day (2019-10-10) in that tree, and yes, there are thus 1440 files. So in other words, this is like an N=1440 x M=17000 matrix, or 25M points. The O(N) analysis means that I would like to understand how this problem scales with M and N. And then looking inside, there are only ~25M points. What needs to be done that takes 15 minutes; is that something that makes sense? Another way to think about it: on a 6 GFlop machine, it amounts to ~200k floating point operations per point. This is the point I don't understand about EMBERS, and I think it would add value to the paper (or README somewhere) to say something about it.
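The back-of-envelope behind those numbers, for concreteness (the 6 GFlop/s figure is a nominal single-core rate quoted above, not a measurement):

# ~15 minutes single-core over N=1440 files x M~17000 lines ~ 25M points
n_files, n_lines = 1440, 17_000
points = n_files * n_lines            # ~2.4e7 points
flops = 6e9 * 15 * 60                 # nominal floating point ops available in 15 minutes
print(f"{points:.1e} points, ~{flops / points:.0f} flops per point")   # ~2.2e5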

I've added instructions to run the automated tests at the bottom of the examples page - Testing EMBERS. It should take no more than 2 minutes to run the tests. To run the tests you need to clone the repository and install a few additional dependencies:

git clone https://github.com/amanchokshi/EMBERS.git
cd EMBERS

# Set up a virtual environment
python -m venv embers-env
source embers-env/bin/activate

pip install .
pip install -r requirements_dev.txt

# Run automated tests
pytest

This ran fine for me. I had two failures (and a few deprecation warnings):

  tests/test_mwa_utils/test_mwa_pointings.py::test_clean_meta_json_start FAILED                                                                                                                                      
  tests/test_tile_maps/test_ref_fee_healpix.py::test_create_model_last FAILED

The deprecation warnings come from a package which I use, mwa-pb, and it has required me to freeze some of the dependencies in requirements.txt. I'm not sure why those two tests are failing. I cloned a fresh copy of EMBERS and tried the tests again and all of them seem to pass. Could it be a python version thing? I'm stumped.

Screen Shot 2020-11-18 at 1 13 39 pm
Screen Shot 2020-11-18 at 1 13 06 pm

The automated tests on travis are all passing too. Check out Latest Test and All Past Tests

Regarding the performance, I've re-run align_batch using only a single core, with a varying number of files. This probes your question of how the problem scales with N. The results seem to indicate a linear relationship between the number of files and processing time.

files_time

I'm quite out of my depth with this kind of code testing, but let me try and clarify exactly what the code does. Each data file has ~17k lines, which each contain 112 floating point numbers corresponding to the 112 frequency channels available. The 17k lines correspond to 17k timesteps at which power was sampled. A pair of such files is selected.

  • The two arrays are interpolated down to a lower frequency (1 Hz default). This reduces each array to (1800, 112)
  • A two stage savgol filter is applied along the time axis (1800 points). This is a 1D filter along the longer axis of the array.
  • The two interpolated and smoothed arrays are saved to disk

I've been looking into the performance as a function of the number of lines in the files. I did this by modifying embers.rf_tools.align_data.savgol_interp to take a fraction and only read the first fraction of lines in the files. The align_batch function is built around the savgol_interp function, parallelizing it and writing the outputs to disk.

I repeated the test 10 times for each fraction and used the median value as an estimate of the time. If we increase the number of loops, I expect the plot to stabilise further.
lines_time

My inference of the above plot is that almost half of the processing time for each set of files comes from reading in the data, beyond which the time increases linearly with the number of lines.

Below is the code which I used to generate this analysis:

import math
from time import time

import numpy as np
from embers.rf_tools.rf_data import read_data
from matplotlib import pyplot as plt
from scipy import interpolate
from scipy.signal import savgol_filter


def savgol_interp(
    ref,
    tile,
    savgol_window_1=None,
    savgol_window_2=None,
    polyorder=None,
    interp_type=None,
    interp_freq=None,
    fraction=1,
):

    """Interpolate a power array followed by savgol smoothing.
    """

    # Read time and power arrays from data files
    ref_power, ref_time = read_data(ref)
    tile_power, tile_time = read_data(tile)

    ref_length = ref_time.shape[0]
    tile_length = tile_time.shape[0]

    # Number of lines in different files will differ due to varied
    # recording rates of different receivers
    #  print(f"Number of lines in rf0XX file: {ref_length}")
    #  print(f"Number of lines in S06XX file: {tile_length}")

    # Select the first [Fraction] of number of lines
    ref_len_new = int(round(fraction * ref_length))
    tile_len_new = int(round(fraction * tile_length))

    ref_power = ref_power[:ref_len_new, ::]
    ref_time = ref_time[:ref_len_new]

    tile_power = tile_power[:tile_len_new, ::]
    tile_time = tile_time[:tile_len_new]

    # Round up/down to nearest integer of time
    start_time = math.ceil(max(ref_time[0], tile_time[0]))
    stop_time = math.floor(min(ref_time[-1], tile_time[-1]))

    # Array of times at which to evaluate the interpolated data
    time_array = np.arange(start_time, stop_time, (1 / interp_freq))

    # Mathematical interpolation functions
    f = interpolate.interp1d(ref_time, ref_power, axis=0, kind=interp_type)
    g = interpolate.interp1d(tile_time, tile_power, axis=0, kind=interp_type)

    # New power array, evaluated at the desired frequency
    ref_ali = f(time_array)
    tile_ali = g(time_array)

    # Savgol level 1. Capture nulls / small scale structure
    ref_ali = savgol_filter(ref_ali, savgol_window_1, polyorder, axis=0)
    tile_ali = savgol_filter(tile_ali, savgol_window_1, polyorder, axis=0)

    # Savgol level 2. Smooth noise
    ref_ali = savgol_filter(ref_ali, savgol_window_2, polyorder, axis=0)
    tile_ali = savgol_filter(tile_ali, savgol_window_2, polyorder, axis=0)

    return (ref_ali, tile_ali, time_array, ref_power, tile_power, ref_time, tile_time)


if __name__ == "__main__":

    ref = "tiles_data/rf0XX/2019-10-10/rf0XX_2019-10-10-00:00.txt"
    tile = "tiles_data/S06XX/2019-10-10/S06XX_2019-10-10-00:00.txt"

    fractions = np.linspace(0.02, 1, 50)

    med_time = []
    for f in fractions:

        # Repeat measurements 10 times and take median
        time_array = []
        for i in range(10):
            start = time()
            si_out = savgol_interp(
                ref,
                tile,
                savgol_window_1=11,
                savgol_window_2=15,
                polyorder=2,
                interp_type="cubic",
                interp_freq=1,
                fraction=f,
            )
            end = time()
            time_array.append(end - start)

        med_time.append(np.median(time_array))

    fig = plt.figure(figsize=(6, 4))
    plt.style.use("seaborn")
    plt.plot(fractions, med_time, linewidth=2.1, marker="o", color="maroon")
    plt.xlabel("Fraction of Number of lines")
    plt.ylabel("Median Time [seconds]")
    plt.tight_layout()
    plt.savefig("lines_time.png")

The deprecation warnings come from a package which I use, mwa-pb, and it has required me to freeze some of the dependencies in requirements.txt. I'm not sure why those two tests are failing. I cloned a fresh copy of EMBERS and tried the tests again and all of them seem to pass. Could it be a python version thing? I'm stumped.

Screen Shot 2020-11-18 at 1 13 39 pm
Screen Shot 2020-11-18 at 1 13 06 pm

The automated tests on travis are all passing too. Check out Latest Test and All Past Tests

I'm stumped too; I'm going to ignore this for the review, as the examples seem to run (OK, I do have one failure still: the ephem_chrono task).

Hi Peter, I'll try and summarise what I've found in my performance tests:

  • O(N) - Performance scales linearly with the number of files, on a single core
  • O(M) - Performance scales linearly with the number of lines of data, with a significant overhead reading in the data. By my estimate, more than half the time processing a single file of ~17k lines can be attributed to reading in the file. This is ~0.27 seconds per file. The processing itself is done with scipy interpolation + filtering, which scales linearly with array size (a small sketch isolating the read cost follows below).
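A minimal sketch to isolate the read cost on one sample file. The path follows the sample-data layout used in the examples, and read_data is the same function imported in the timing script above.

from time import perf_counter

from embers.rf_tools.rf_data import read_data

rf_file = "tiles_data/S06XX/2019-10-10/S06XX_2019-10-10-00:00.txt"

t0 = perf_counter()
power, times = read_data(rf_file)
t1 = perf_counter()

print(f"read_data: {t1 - t0:.3f} s for a power array of shape {power.shape}")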

Thanks Aman, this is indeed very useful stuff to understand. 50% in the I/O is quite a lot. These are 2kb files, that should read in 0.02ms on a typical 100MB/s disk. So for a significant amount of time it does something else. Maybe the decoding? Even the readlines() routine isn't able to perform at formal speed. I measured indeed about 250ms for reading one file. When not processing any line, so returning nothing, it still took 5-10 ms. I might play with this; I didn't see any obvious ways to speed this up.


Thanks Aman, this is indeed very useful stuff to understand. 50% in the I/O is quite a lot. These are 2kb files, that should read in 0.02ms on a typical 100MB/s disk. So for a significant amount of time it does something else. Maybe the decoding? Even the readlines() routine isn't able to perform at formal speed. I measured indeed about 250ms for reading one file. When not processing any line, so returning nothing, it still took 5-10 ms. I might play with this; I didn't see any obvious ways to speed this up.

Oops, I messed up my units. The files are 2 MB, so reading should take around 10 ms, which corresponds to my measurement of just processing the lines with no decoding. So the decoding is expensive, more like 200 ms, 20 times as expensive as reading the lines.
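A rough way to separate the raw read from the decoding on one sample file. This is a sketch only: the whitespace split and float conversion are assumptions about the file layout and may differ from what read_data actually does.

from time import perf_counter

rf_file = "tiles_data/S06XX/2019-10-10/S06XX_2019-10-10-00:00.txt"

t0 = perf_counter()
with open(rf_file) as f:
    lines = f.readlines()                 # raw I/O only
t1 = perf_counter()

# Decode every field to float (assumed whitespace-separated values)
values = [[float(x) for x in line.split()] for line in lines]
t2 = perf_counter()

print(f"readlines: {t1 - t0:.3f} s, decoding: {t2 - t1:.3f} s")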

To close up the review, and assuming you can add a few words on the performance, the only puzzle remaining is why I could not get ephem_chrono to work. It seems a few other tools after that also fail for me, possibly because of this? I also had a case where leaving out the --max_cores=4 would cause it to fail with the error "ValueError: max_workers must be greater than 0"

I think I know where the error in --max_cores comes from. I'll test it out and ping you. The ephem_chrono problem is a puzzle, but I'll try it again once I've finished the max_cores issue.

The batch scripts should now work after you update EMBERS to the latest version - 0.8.2. I had added a bad default value to the max_cores argparse option.

I ran ephem_chrono again and things worked. Do you have a folder called TLE in the sat_utils directory? This should have been copied over when you cloned the sample data and moved the sample data to the correct dirs.

Before running ephem_batch:

tree -d sat_utils
sat_utils
└── TLE

After running ephem_batch:

tree -d sat_utils
sat_utils
├── TLE
├── ephem_data
└── ephem_plots

The above data is required for ephem_chrono to work. Finally, after ephem_chrono runs successfully:

tree -d sat_utils
sat_utils
├── TLE
├── ephem_chrono
├── ephem_data
└── ephem_plots

Can you delete all the data in the sat_utils directory, except the TLE folder, then run ephem_batch followed by ephem_chrono?

I've created a new Performance page in the documentation. This outlines our investigation into parallel and single core performance and states the I/O limitations. I've also added a link to the performance page in the examples.

I've removed all directories (except TLE) in embers_out/sat_utils; the TLE directory has the following 73 files:

21576.txt 25113.txt 25117.txt 25159.txt 25415.txt 25419.txt 25477.txt 25481.txt 25982.txt 25986.txt 33062.txt 40069.txt 40089.txt 41180.txt 41184.txt 41188.txt 44020.txt 44025.txt 44387.txt
23545.txt 25114.txt 25118.txt 25338.txt 25416.txt 25420.txt 25478.txt 25482.txt 25983.txt 28654.txt 33063.txt 40086.txt 40090.txt 41181.txt 41185.txt 41189.txt 44022.txt 44026.txt
23546.txt 25115.txt 25119.txt 25413.txt 25417.txt 25475.txt 25479.txt 25980.txt 25984.txt 33060.txt 33064.txt 40087.txt 40091.txt 41182.txt 41186.txt 44018.txt 44023.txt 44027.txt
25112.txt 25116.txt 25158.txt 25414.txt 25418.txt 25476.txt 25480.txt 25981.txt 25985.txt 33061.txt 33065.txt 40088.txt 41179.txt 41183.txt 41187.txt 44019.txt 44024.txt 44028.txt

I then ran ephem_batch --max_cores=1 and only noticed some warnings:
sat_ephemeris.py:286: RuntimeWarning: More than 20 figures have been opened.
Figures created through the pyplot interface (matplotlib.pyplot.figure) are retained until explicitly closed and may consume too much memory.
Running ephem_chrono immediately returns (though it has the coffee message)

$ ephem_chrono
Saving chronological Ephem files to: ./embers_out/sat_utils/ephem_chrono
Grab a coffee, this may take more than a couple of minutes!

It does create all the json files, but they just contain [], and the log file has 0 lines.

I am doing this in a clean directory, where I only see embers_out and tiles_data. The code is somewhere else, which I installed using "pip install -e ." I am using a "venv", which is a directory inside that EMBERS source tree.

This is bizarre. Let's try a couple of long shots. Can you check whether the files in the TLE directory contain anything? Perhaps they've been corrupted somehow?

You should see something like this:

$ tail embers_out/sat_utils/TLE/21576.txt

1 21576U 91050C   19282.72964883  .00000014  00000-0  18223-4 0  9990
2 21576  98.6203 331.5014 0002898 326.0593  34.0410 14.41572163482935
1 21576U 91050C   19283.90959667 +.00000017 +00000-0 +19271-4 0  9992
2 21576 098.6201 332.6919 0002839 321.6987 038.4000 14.41572293482883

Have you been running all the other examples in the same way? With EMBERS installed using pip install -e . in the source tree?

I've attempted to replicate the way you're running things. I have my source directory at ~/Repos/EMBERS, within which I've created a virtual environment and run pip install --editable . If I do a pip list when this virtual environment is active, I see that embers version 0.8.2 is installed from source:

Package                       Version     Location
----------------------------- ----------- -----------------------------------
embers                        0.8.2       /Users/amanchokshi/Repos/EMBERS/src

I've then created a new directory ~/Repos/em_test/ which just contains tiles_data and embers_out/sat_utils/TLE

Keeping the original virtual env active, I run the following

$ ephem_batch
$ ephem_chrono

Checking the contents of the first file in the ephem_chrono dir:

$ tail embers_out/sat_utils/ephem_chrono/2019-10-10-00:00.json

            1.5377585726597032,
            1.5366094819324243,
            1.5354642595022225,
            1.534322886123595,
            1.5331853425510376,
            1.5320516095390462,
            1.5309216678421183
        ]
    }
]% 

Have I deviated from your setup in any way?

PS: It would usually take a while to process the data (~10 mins) if we were working with a larger data set; the sample data only contains a couple of passes for each satellite. I'll remove the coffee message in the next EMBERS version.

So many matplotlib figures should not be opened by ephem_batch, instead figures should be saved to embers_out/sat_utils/ephem_plots. Does ephem_batch create this directory and .png figures for you?

Can you update EMBERS to V0.8.3, or git pull and build from source again? I've explicitly closed the figures after the plot is saved. This shouldn't be necessary, but may help.

Also have removed the coffee message, which was unnecessary

Indeed my embers_out/sat_utils/TLE/21576.txt looks like yours. I've been able to run all steps before this fatal one just fine; all the figures are like yours.

You have also done what I did, pip installing it with the source in place. I use a pretty standard anaconda3 as the base for my python, not the system python. I've actually reinstalled that one.

I will retrace the steps, but a question before I do this: can I reinstall and just do ephem_batch? It doesn't need the lengthy align_batch, does it?

A suggestion on your side might be to have some kind of -v flag for all your scripts which makes them verbose, to help show some expected intermediate output and see what's going on here. In my own system I have debug=LEVEL, which creates more and more output the higher the level is. Other packages have -vvv instead of -v to increase the verbosity level :-) Even for myself I find it useful, as I forget my own code, and a debug level could help with that.
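A minimal sketch of what such a flag could look like, using argparse and logging; the argument names and logging setup here are illustrative, not part of the EMBERS CLI.

import argparse
import logging

parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="count", default=0,
                    help="increase verbosity: -v for info, -vv for debug")
args = parser.parse_args()

# Map repeat count to a logging level: default WARNING, -v INFO, -vv DEBUG
level = [logging.WARNING, logging.INFO, logging.DEBUG][min(args.verbose, 2)]
logging.basicConfig(level=level)

logging.info("intermediate output visible with -v")
logging.debug("extra detail visible with -vv")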

So many matplotlib figures should not be opened by ephem_batch, instead figures should be saved to embers_out/sat_utils/ephem_plots. Does ephem_batch create this directory and .png figures for you?

Can you update EMBERS to V0.8.3, or git pull and build from source again? I've explicitly closed the figures after the plot is saved. This shouldn't be necessary, but may help.

Also have removed the coffee message, which was unnecessary

I noted that 3 files in TLE have 0 length: 44019.txt, 44020.txt, 44025.txt; the rest have anywhere from 20 to 78 lines. There are 73 files in here.

I do see 70 plots in the ephem_plots directory (not 73, since 3 of my TLEs were 0 length).

I've updated the code again. Ran ephem_chrono, but still no json files.

You don't need to run align_batch again. The only thing that any of the ephem scripts need is the TLE files. It's also okay that 3 of the TLE files are empty; those satellites have very sparse coverage and had no passes visible from the site on that day.

I've been using pyenv to install python versions on my system. Currently I've used pyenv to set my default version to 3.8.3. I used pyenv virtualenv to create my virtual environments. I don't think this should be an issue.

Are you on a python version >= 3.6? I have used f-strings, so 3.5 will have problems.

Just a thought, is the directory with tiles_data and embers_out/sat_utils/TLE on your external VFAT formatted drive? The names of the json files created have a colon : in them. Could that be tripping things up?

You don't need to run align_batch again. The only thing that any of the ephem scripts need is the TLE files. It's also okay that 3 of the TLE files are empty; those satellites have very sparse coverage and had no passes visible from the site on that day.

I've been using pyenv to install python versions on my system. Currently I've used pyenv to set my default version to 3.8.3. I used pyenv virtualenv to create my virtual environments. I don't think this should be an issue.

Are you on a python version >= 3.6? I have used f-strings, so 3.5 will have problems.

python 3.8.5
the whole tree with embers_out and tiles_data is on a regular ext4 formatted drive.
The json files look fine in their name, e.g.
-rw-rw-r-- 1 teuben teuben 2 Nov 19 09:00 embers_out/sat_utils/ephem_chrono/2019-10-10-00:00.json

Let's rule out problems in ephem_batch so that we can focus on ephem_chrono. Can you try and see if the log seems similar?

$ tail embers_out/sat_utils/ephem_data/ephem_batch.log

INFO: ephem_batch: Saved sky coverage plot of satellite [25481] to ./embers_out/sat_utilsephem_plots/25481.png
Saved ephemeris of satellite [25481] to ./embers_out/sat_utilsephem_data/25481.npz
INFO: ephem_batch: Saved sky coverage plot of satellite [23546] to ./embers_out/sat_utilsephem_plots/23546.png
Saved ephemeris of satellite [23546] to ./embers_out/sat_utilsephem_data/23546.npz
INFO: ephem_batch: Saved sky coverage plot of satellite [25479] to ./embers_out/sat_utilsephem_plots/25479.png
Saved ephemeris of satellite [25479] to ./embers_out/sat_utilsephem_data/25479.npz
INFO: ephem_batch: Saved sky coverage plot of satellite [23545] to ./embers_out/sat_utilsephem_plots/23545.png
Saved ephemeris of satellite [23545] to ./embers_out/sat_utilsephem_data/23545.npz
INFO: ephem_batch: Saved sky coverage plot of satellite [25478] to ./embers_out/sat_utilsephem_plots/25478.png
Saved ephemeris of satellite [25478] to ./embers_out/sat_utilsephem_data/25478.npz

Also that the first plot embers_out/sat_utils/ephem_plots/21576.png looks like this:

21576

And finally that the size of the ephem_data directory matches mine:

$ du -sh embers_out/sat_utils/ephem_data

>>> 2.2M    embers_out/sat_utils/ephem_data

The log file looks OK, but has a different order of the files downloaded. The file has 143 lines (73 for files, 70 for successful ones).

The plot looks similar, though mine has 41 passes, yours has 5.

My du -sh embers_out/sat_utils/ephem_data gives:
17M     embers_out/sat_utils/ephem_data

e.g. the one you plotted, where I had 41 passes:

-rw-rw-r-- 1 teuben teuben 146855 Nov 19 08:26 21576.npz

It has 70 files (3 missing due to 0 length TLE files)


It's really strange that you have 41 satellite passes. Can you try this:

$ ephem_single --sat=21576 --out_dir=tmp

$ tree tmp

tmp
├── ephem_data
│   └── 21576.npz
└── ephem_plots
    └── 21576.png

The .npz data file should be only 21K in size. If the plot has more than 5 satellite passes, it means that something is wrong with the TLE files. This would explain why ephem_chrono is making empty files.

It's a long shot, but can you check if the contents of the TLE file are the same as mine:

$ more embers_out/sat_utils/TLE/21576.txt

1 21576U 91050C   19282.72964883  .00000014  00000-0  18223-4 0  9990
2 21576  98.6203 331.5014 0002898 326.0593  34.0410 14.41572163482935
1 21576U 91050C   19283.90959667 +.00000017 +00000-0 +19271-4 0  9992
2 21576 098.6201 332.6919 0002839 321.6987 038.4000 14.41572293482883
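A quick hedged sketch for checking all the TLE files at once (it assumes the embers_out/sat_utils/TLE layout shown above, and that each TLE entry is the standard two lines):

from pathlib import Path

tle_dir = Path("embers_out/sat_utils/TLE")
for tle in sorted(tle_dir.glob("*.txt")):
    n_lines = len(tle.read_text().splitlines())
    print(f"{tle.name}: {n_lines} lines, {n_lines // 2} TLE entries")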

ls -l tmp/ephem_/
-rw-rw-r-- 1 teuben teuben 146855 Nov 19 10:18 tmp/ephem_data/21576.npz
-rw-rw-r-- 1 teuben teuben 171585 Nov 19 10:18 tmp/ephem_plots/21576.png

So much bigger. The png shows 41 passes.


Mine is clearly longer:

more embers_out/sat_utils/TLE/21576.txt
1 21576U 91050C   19274.12296826  .00000022  00000-0  20647-4 0  9992
2 21576  98.6215 322.8165 0003370 355.9403   4.1757 14.41571172481698
1 21576U 91050C   19274.74764681 +.00000022 +00000-0 +20705-4 0  9994
2 21576 098.6214 323.4469 0003330 354.0023 006.1125 14.41571283481565
1 21576U 91050C   19275.64996028  .00000018  00000-0 19615-4 0  9999
2 21576  98.6213 324.3575 0003275 351.1025   9.0105 14.41571404481914
1 21576U 91050C   19276.69109109  .00000011  00000-0 17383-4 0  9994
2 21576  98.6211 325.4081 0003216 347.5805  12.5303 14.41571520482061
1 21576U 91050C   19277.73222188  .00000003  00000-0 14745-4 0  9993
2 21576  98.6209 326.4587 0003154 343.9674  16.1414 14.41571600482214
1 21576U 91050C   19278.70394390 -.00000000 +00000-0 +13856-4 0  9996
2 21576 098.6208 327.4393 0003103 340.6734 019.4337 14.41571694482138
1 21576U 91050C   19279.12039615  .00000001  00000-0 14345-4 0  9993
2 21576  98.6207 327.8595 0003082 339.1436  20.9627 14.41571750482416
1 21576U 91050C   19279.74507450 +.00000006 +00000-0 +15778-4 0  9994
2 21576 098.6207 328.4898 0003052 336.8952 023.2099 14.41571851482289
1 21576U 91050C   19280.36975290  .00000009  00000-0 16766-4 0  9996
2 21576  98.6206 329.1202 0003016 334.7424  25.3617 14.41571918482593
1 21576U 91050C   19280.71679643 +.00000012 +00000-0 +17538-4 0  9990
2 21576 098.6206 329.4703 0002999 333.5040 026.5995 14.41571968482421
1 21576U 91050C   19281.68851827  .00000014  00000-0 18394-4 0  9996
2 21576  98.6205 330.4508 0002950 329.8875  30.2144 14.41572084482783
1 21576U 91050C   19282.72964883  .00000014  00000-0 18223-4 0  9990
2 21576  98.6203 331.5014 0002898 326.0593  34.0410 14.41572163482935


The data you have doesn't match what's on github currently. I initially had too much data on the sample data repo and had to reduce it to make the examples run in a reasonable time.

Can you pull down the sample data repo again, or just download the TLE dir from the sample data repo?

The new TLE files should be sufficient to get ephem_chrono to work.

Can you check whether all the files in tiles_data have the time-stamp 2019-10-10 in them? If they do, you have the correct data and the wrong TLE files. If they don't, you'll unfortunately have to pull the whole sample data repo again.

Sorry for the hassle. Hopefully it's smooth sailing from here 🤞

I redid the data repo instructions. I now have a short embers_out/sat_utils/TLE/21576.txt file, with 4 lines.
The tiles_data are all (1440 I see) from 2019-10-10

I'm afraid, even with a new align_batch and ephem_batch I still get a blank ephem_chrono.

I think I'm going to redo the whole workflow.

Maybe you could make a bash script with just the important stuff and let it run?

#!/bin/bash

align_batch --max_cores=1
ephem_batch --max_cores=1
ephem_chrono

Although I had done exactly those steps, now it's producing non-zero json files. I see 48 json files, of which the first 16, up to 2019-10-10-07:30, have a good size. The ones from 2019-10-10-08:00.json to 2019-10-10-23:30.json are blank.
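To see exactly where the output goes blank, a small sketch that lists the chrono json files and flags the ones that contain just [] (the path follows the examples above; the 4-byte threshold is an assumption based on the empty files seen earlier):

from pathlib import Path

chrono = Path("embers_out/sat_utils/ephem_chrono")
for f in sorted(chrono.glob("*.json")):
    size = f.stat().st_size
    flag = "  <-- blank" if size <= 4 else ""
    print(f"{f.name}: {size} bytes{flag}")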

/ooo November 23 until November 30

If we can fix this last item, we're done with the review!
