Hi,
In one of our projects with @Mataivic, we easily generate rather huge dataset collections (3 x 10000 files). We had some performance issues with 17.05 during form rendering (to populate the drop-down list, I guess). It could take several minutes to display a tool form that required some FASTA files as input (we had 30000 of them in the history).
The timing of the 18.01 release was perfect because the project was stuck.
But but but, it's worse with 18.01. I'm sure, or at least I hope, that it's because I made a mistake during the migration.
Even with fewer than 6000 files in the history, Galaxy gives up after 59s when displaying the tool form.
A red box eventually appears:
Cannot connect to Galaxy
Galaxy is currently unreachable. Please try again in a few minutes. Please contact a Galaxy administrator if the problem persists.
With the developer tools in Chrome, we get this in the Network tab:
#Name: build?version=2.0.2&__identifer=5f1cbjxk366&tool_version=2.0.2
#Status: (failed)
#Type: xhr
#Initiator: jquery.js:9175
#Size: 0B
#Time: 1.0 min
We also hit this "timeout" when we display the jobs in the admin interface or when we open a shared history.
We (myself, @mhoebekesbr and @pbordron) don't get much info in either the uwsgi or the nginx logs. But maybe we aren't looking in the right place.
We tried bypassing NGINX, and we get the same result:
uwsgi:
  http: :8080
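To compare the two paths outside the browser, a small timing helper like the one below may help. This is only a sketch: the base URLs in the commented usage are assumptions taken from the ports shown in this thread, and a real Galaxy request would also need a session cookie or API key, which is not handled here.

```python
import time
import urllib.error
import urllib.request


def time_get(url, timeout=300.0, opener=urllib.request.urlopen):
    """Return (elapsed_seconds, outcome) for a single GET request.

    outcome is the HTTP status code when the server answers, or the
    exception class name when the request fails (timeout, dropped
    connection, refused connection, ...).
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            response.read()  # wait for the full body, as the browser would
            outcome = response.status
    except urllib.error.HTTPError as exc:
        outcome = exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError) as exc:
        outcome = type(exc).__name__
    return time.monotonic() - start, outcome


# Hypothetical usage: the same path through nginx and straight to uwsgi.
# path = "/"  # replace with the slow request seen in the Network tab
# for label, base in (("nginx", "http://localhost:80"),
#                     ("uwsgi", "http://localhost:8080")):
#     elapsed, outcome = time_get(base + path)
#     print(f"{label}: {outcome} after {elapsed:.1f}s")
```

If both paths fail after roughly the same delay, the cut-off is probably not coming from the nginx proxy timeouts.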
It's a rather small dev instance hosted in a VM:
uwsgi:
  http: :8080
  processes: 4
  threads: 4
  offload-threads: 1
  static-map: /static/style=static/style/blue
  static-map: /static=static
  master: true
  virtualenv: .venv
  pythonpath: lib
  module: galaxy.webapps.galaxy.buildapp:uwsgi_app()
  die-on-term: true
  hook-master-start: unix_signal:2 gracefully_kill_them_all
  hook-master-start: unix_signal:15 gracefully_kill_them_all
  py-call-osafterfork: true
  enable-threads: true
  mule: lib/galaxy/main.py
  farm: job-handlers:1
[program:galaxy_uwsgi]
command = /w/galaxy/galaxydev/galaxy/.venv/bin/uwsgi --yaml /w/galaxy/galaxydev/galaxy/config/galaxy.yml --logto /tmp/uwsgi_logto.log
directory = /w/galaxy/galaxydev/galaxy/
umask = 022
autostart = true
autorestart = true
startsecs = 10
user = galaxydev
environment = PATH="/w/galaxy/galaxydev/galaxy/.venv/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",PYTHONPATH="/w/galaxy/galaxydev/galaxy/lib",HOME="/w/galaxy/galaxydev/galaxy/",USER="galaxydev",DRMAA_LIBRARY_PATH=/opt/sge/lib/lx24-amd64/libdrmaa.so.1.0
;numprocs = 1
stopsignal = INT
log_stdout = true
log_stderr = true
loglevel = blather
logfile = /tmp/supervisord_galaxy_uwsgi.log
upstream galaxy {
    server localhost:8080;
}
server {
    listen 80 default_server;
    listen [::]:80 default_server;
    server_name _;
    client_max_body_size 10G; # max upload size that can be handled by POST requests through nginx
    # use a variable for convenience
    set $galaxy_root /w/galaxy/dev/galaxy;
    location / {
        proxy_pass http://galaxy;
        # for debugging
        proxy_read_timeout 300;
        proxy_send_timeout 300;
        proxy_connect_timeout 300;
        # end of debugging settings
        proxy_set_header X-Forwarded-Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
    # serve framework static content
    location /static {
        alias $galaxy_root/static;
        expires 24h;
    }
    location /static/style {
        alias $galaxy_root/static/style/blue;
        expires 24h;
    }
    location /static/scripts {
        alias $galaxy_root/static/scripts;
        expires 24h;
    }
    location /robots.txt {
        alias $galaxy_root/static/robots.txt;
        expires 24h;
    }
    location /favicon.ico {
        alias $galaxy_root/static/favicon.ico;
        expires 24h;
    }
    # serve visualization and interactive environment plugin static content
    location ~ ^/plugins/(?<plug_type>.+?)/(?<vis_name>.+?)/static/(?<static_file>.*?)$ {
        alias $galaxy_root/config/plugins/$plug_type/$vis_name/static/$static_file;
        expires 24;
    }
    location /_x_accel_redirect/ {
        internal;
        alias /;
    }
}
We can provide extra information if the above isn't enough.
Many thanks in advance.
You weren't using uwsgi prior to this right - with 17.05?
Do you have histories with a large number of items visible or are they mostly hidden under collections in the history panel?
I spent a long time optimizing the precursor to the build endpoint, but that was with histories orders of magnitude smaller than this 😅. I should try again with this newer endpoint and with larger histories.
I'm finding some exciting low-hanging fruit that might really help - I did a bunch of other database optimization stuff recently, so I'm maybe better at this than I was in the past. If I give y'all a patch against 18.01, any chance you can test it for me?
I surely can help testing a patch, thanks!
@nsoranzo Out of curiosity, does your use case have a mix of lists and nested lists (list:list or list:paired) or is the problematic history just flat lists?
Just lists, for a total of around 50,000 datasets among 6 lists.
@nsoranzo https://github.com/galaxyproject/galaxy/pull/5977 should help a great deal for this usage pattern ... I'm not sure we have tests for the kinds of things that might break with it, though. I'd love some feedback on this if you are willing - these results are often counter-intuitive.
Some notes I've been taking based on profiling that API endpoint are here: https://gist.github.com/jmchilton/d68565662f7f4b7ee2640f09fbb92962.
In addition to just raw timings of things in that branch - it'd be extra bonus cool to know how each commit affects the timing as well as having sql_debug log of the queries.
These logs can be generated by applying https://github.com/galaxyproject/galaxy/pull/5539 to your instance, hitting the same build endpoint the browser uses for the tool form with sql_debug=1 added to the query parameters, and then excavating them from your Galaxy web logs. Obviously collecting all that different data is a large project - even just before and after timings on the open PR would be super helpful, and anything on top of that is just a bonus.
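As a rough illustration of the request described above, here is a minimal sketch that builds such a URL. It assumes the build endpoint lives at /api/tools/<tool_id>/build and that an API key can be passed as a `key` query parameter. The host, tool id, and history id are placeholders, and `build_debug_url` is a hypothetical helper, not part of Galaxy.

```python
from urllib.parse import quote, urlencode


def build_debug_url(base_url, tool_id, tool_version=None, history_id=None, api_key=None):
    """Build a tool-form "build" URL with sql_debug=1 appended.

    Assumption: the endpoint is /api/tools/<tool_id>/build. tool_version
    mirrors the parameter visible in the browser request; history_id and
    api_key are optional and hypothetical here.
    """
    params = {}
    if tool_version is not None:
        params["tool_version"] = tool_version
    if history_id is not None:
        params["history_id"] = history_id
    if api_key is not None:
        params["key"] = api_key
    params["sql_debug"] = "1"  # ask the patched instance to log its SQL queries
    return f"{base_url.rstrip('/')}/api/tools/{quote(tool_id)}/build?{urlencode(params)}"


# Placeholder values, for illustration only.
print(build_debug_url("http://localhost:8080", "cat1", tool_version="2.0.2"))
```

Requesting the same URL before and after switching branches gives comparable timings, with the matching queries in the web logs.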
You weren't using uwsgi prior to this right - with 17.05?
We already used uwsgi with 17.05.
I didn't really catch the real difference between the way Galaxy deals with uwsgi in 17.05 and in 18.01. I have very little knowledge of admin stuff.
Do you have histories with a large number of items visible or are they mostly hidden under collections in the history panel?
Mostly hidden under collections in the history panel
I'm finding some exciting low-hanging fruit that might really help - I did a bunch of other database optimization stuff recently, so I'm maybe better at this than I was in the past. If I give y'all a patch against 18.01, any chance you can test it for me?
Whenever you want! This instance is currently dedicated to this project and can crash for a good purpose 😁
I really don't know whether these are useful clues, but:
a list instead of a list of 810 items
I will test your patch tomorrow. Should I just switch to this branch, jmchilton:1801_db_opt?
Anyway many thanks for your interest and your quick response.
It is interesting and concerning that different users see different performance. Is it possible that one of you is in admin_users and one of you is not? If there are security checks and such skipped for an admin that might explain things?
A new branch that I think will be a bit better is jmchilton:1801_tool_state_opt_2, for the new PR https://github.com/galaxyproject/galaxy/pull/5983.
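The admin_users hypothesis can be checked quickly by comparing each tester's email against the configured value. A minimal sketch, assuming admin_users is a comma-separated list of email addresses and that matching is exact after trimming whitespace; the helper names and addresses are hypothetical.

```python
def parse_admin_users(value):
    """Split a comma-separated admin_users value into a set of emails."""
    return {email.strip() for email in value.split(",") if email.strip()}


def is_admin(email, admin_users_value):
    """Return True if email appears in the admin_users value.

    Assumption: exact match after trimming whitespace; Galaxy's own
    comparison rules are not reproduced here.
    """
    return email.strip() in parse_admin_users(admin_users_value)


# Hypothetical addresses, for illustration only.
admin_users = "admin@example.org, other.admin@example.org"
for tester in ("admin@example.org", "colleague@example.org"):
    print(tester, "admin" if is_admin(tester, admin_users) else "not admin")
```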
I am an admin and my colleagues are not.
I can easily check this hypothesis with a couple of other rats :)
So I spent a long time today working on https://github.com/galaxyproject/galaxy/issues/5987 - my initial testing shows it can speed things up for tool forms that don't use dynamic filters (so most tool forms should see the improvement, I think). I'll try to polish it up and open yet another new pull request in the coming days.
And here it is - https://github.com/galaxyproject/galaxy/pull/5997. It is much more performant in my tests - let me know though.
✨ Magic @jmchilton ✨
The forms now display almost immediately even with 45,000 datasets
You once again saved my ~~life~~ project 🍻
Can not agree more with @lecorguille!
Hum ... @jmchilton can you do the same magic on the tool
Collection Operations > Filter failed datasets from a list
(The other Collection Operations tools don't feel any better)
Many many thanks
@lecorguille Oh crap, good catch - I've pushed a bug fix into https://github.com/galaxyproject/galaxy/pull/5997 that fixes the performance for Filter Failed and other tools without any data input parameters.
@jmchilton Sorry, but I can't see any improvements
After 59s
Cannot connect to Galaxy
Galaxy is currently unreachable. Please try again in a few minutes. Please contact a Galaxy administrator if the problem persists.
Uncaught error.
Ugh - the problem with filter failed is conditional on the order of large vs. small collections in your history 😑. https://github.com/galaxyproject/galaxy/pull/6046 should however "fix" it, any chance I can get you to test it @lecorguille?
I will be happy to test that, but on Friday (I'm at home today).
Many thanks for your quick response.
It's a Bird...It's a Plane...It's Superman... It's @jmchilton
@lecorguille can this be closed?
We still get timeouts sometimes, but we need to investigate a little on our side.
The different PRs definitely improved the UI 👍
I can reopen this issue later if needed.