mirror of
https://github.com/asciinema/asciinema.git
synced 2026-04-25 07:55:51 +03:00
[GH-ISSUE #213] Bad request for uploads of >4kb recordings under CentOS (Python 3.4) #154
Originally created by @andyone on GitHub (Jun 7, 2017).
Original GitHub issue: https://github.com/asciinema/asciinema/issues/213
Bug report
System info:
Steps to reproduce:
asciinema upload asciicast.json
Expected behavior:
File uploaded to asciinema.org
Actual behavior:
Client prints an error message:
Additional info:
The client creates a broken recording if zsh (4.3.11 (x86_64-redhat-linux-gnu) in my case) is used and oh-my-zsh is installed. If oh-my-zsh is disabled or bash is used as the shell, the client creates and uploads the recording without any problems.
Recording JSON: https://gist.github.com/andyone/b2a883e8c3795a6ad393a715ff7a41df
@ThiefMaster commented on GitHub (Jun 7, 2017):
Happens for me too. Using ZSH but not OMZ.
tmpw6byrbv8-asciinema.json
@andyone commented on GitHub (Jun 7, 2017):
I found that if I change the API URL from HTTPS to HTTP, all works fine.
@ku1ik commented on GitHub (Jun 8, 2017):
I've changed load balancer configuration yesterday so this may be related.
@ku1ik commented on GitHub (Jun 8, 2017):
I was able to reproduce this in a CentOS 7 Vagrant VM. I think this has something to do with the Brightbox load balancer (with SSL termination and automatic Let's Encrypt certificate) which we have been using since yesterday.
@ku1ik commented on GitHub (Jun 8, 2017):
@andyone @ThiefMaster can you try now? I may have solved it.
@ThiefMaster commented on GitHub (Jun 8, 2017):
still getting a 400
@andyone commented on GitHub (Jun 8, 2017):
I think it is an OpenSSL-related issue. Sending data with curl is OK because curl uses NSS (Network Security Services) for SSL/TLS.
Is it an nginx-based solution?
@ku1ik commented on GitHub (Jun 8, 2017):
@andyone I think Brightbox load balancer uses Haproxy.
@ku1ik commented on GitHub (Jun 8, 2017):
I can consistently reproduce this. I created Vagrantfile and instructions: https://github.com/sickill/bb-lb-400
@ku1ik commented on GitHub (Jun 8, 2017):
@andyone the problem doesn't seem to be this specific line in your recording, but the overall size of the uploaded json file.
@andyone commented on GitHub (Jun 8, 2017):
I created a proxy https://ascii.kaos.io based on webkaos (it's improved nginx with BoringSSL) with this config. My and @ThiefMaster recordings uploaded successfully over this proxy.
@ku1ik commented on GitHub (Jun 8, 2017):
Here's what I know so far:
HTTP requests go through the Brightbox load balancer fine, but HTTPS requests get 400 Bad Request when the request body is larger than about 4KB.
Interesting thing is we're getting 400 for HTTPS under CentOS. HTTPS under macOS works fine. (HTTP works fine everywhere).
I looked deeper and tried to find out where the difference is. I used tcpdump to see the requests on both CentOS and macOS (over HTTP, assuming the request itself is formatted the same as under HTTPS).
The only difference seems to be 2 empty lines before the body on macOS versus 1 empty line on CentOS (probably due to slightly different versions of urllib shipped with Python 3 on these OSes):
CentOS:
macOS:
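Since the dump contents aren't reproduced in this mirror, here is a sketch of how that blank-line difference could be checked programmatically; the capture bytes below are hypothetical stand-ins for the real dumps:

```python
# Count the blank lines that separate HTTP headers from the body in a
# raw captured request. A compliant request ends its head with a single
# CRLF CRLF; extra blank lines before the body show up as extra CRLFs.

def blank_lines_before_body(raw: bytes) -> int:
    """Return how many empty lines sit between the last header and the body."""
    _head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        return 0  # no header terminator found at all
    count = 1
    # Additional leading CRLFs in the body region mean extra blank lines.
    while body.startswith(b"\r\n"):
        count += 1
        body = body[2:]
    return count

# Hypothetical captures mimicking the observation in the thread:
centos_like = b"POST /api HTTP/1.1\r\nHost: example.org\r\n\r\n{json}"
macos_like = b"POST /api HTTP/1.1\r\nHost: example.org\r\n\r\n\r\n{json}"

print(blank_lines_before_body(centos_like))  # 1
print(blank_lines_before_body(macos_like))   # 2
```

Running this over the actual hexdumps would confirm whether the extra blank line is the only divergence.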
To see how it affects things I temporarily changed "Request buffer size" on LB from 4096 (default) to 8192 (max) and it suddenly started working fine everywhere (all OSes, HTTPS), yay!
I'm not super confident this is the ultimate solution because with buffer size of 4096 this is true:
HTTPS on macOS
HTTPS on CentOS
When I bump "request buffer size" to 8192, the body size and protocol don't matter and all works fine. I wonder though whether by bumping it to 8192 I'm only buying time (making fewer people affected) or this solves the problem completely (if so, then why?).
I contacted Brightbox about this, hopefully they can explain what's going on.
@ku1ik commented on GitHub (Jun 8, 2017):
Update re 8192 buffer size on Brightbox side: with this number it works for me under CentOS but still doesn't work for @ThiefMaster .
@andyone commented on GitHub (Jun 8, 2017):
Oops, sorry.
@ku1ik commented on GitHub (Jun 8, 2017):
Before I put the traffic through Brightbox LB I terminated SSL in Nginx and everything was working fine for years. If it works with @andyone's proxy based on Nginx then it may suggest Nginx is more "forgiving" about request formatting, while Haproxy is more strict, and asciinema client formats the request incorrectly (for Haproxy standards) under Python 3.4 (and its urllib, which is older than the 3.6.1 I use on mac).
@andyone commented on GitHub (Jun 8, 2017):
I can check it later with Haproxy, but my version is built with LibreSSL instead of OpenSSL.
@ku1ik commented on GitHub (Jun 8, 2017):
My current theory is this:
This single new line between the headers and body is not enough for the LB to finish reading headers (it expects 2 new lines), so it keeps reading all the data below as headers, counting bytes, which eventually exceeds the max header size. If the LB has some variable like bytes_read (bytes read from the socket), it checks its value after finishing reading the headers, and then again after reading the body. If you upload a <4kb file it never crosses the 4kb limit for headers; if you upload a >4kb one it exceeds it. (And this only happens under HTTPS.)
No idea if that's the case, just thinking out loud 😀
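The theory above can be simulated. This toy model is purely an illustration of the hypothesized behaviour, not haproxy's actual parser:

```python
# Toy model of the theory: a proxy reads "headers" until it sees the
# header terminator, rejecting the request if the header bytes exceed a
# buffer limit. If the client's terminator is missing (only one newline
# where two are expected), the body gets counted as headers.

BUF_SIZE = 4096

def toy_proxy(raw: bytes, terminator: bytes = b"\r\n\r\n") -> str:
    """Return a 400 if header bytes exceed BUF_SIZE before the terminator."""
    idx = raw.find(terminator)
    header_bytes = len(raw) if idx == -1 else idx
    return "400 Bad Request" if header_bytes > BUF_SIZE else "200 OK"

headers = b"POST /api HTTP/1.1\r\nHost: example.org\r\n"
small_body = b"x" * 1000
big_body = b"x" * 5000

# Proper terminator: body size never counts against the header buffer.
print(toy_proxy(headers + b"\r\n" + big_body))  # 200 OK
# Missing terminator: the body is counted as headers, so only bodies
# above ~4KB trip the limit -- matching the <4kb / >4kb split observed.
print(toy_proxy(headers + small_body))          # 200 OK
print(toy_proxy(headers + big_body))            # 400 Bad Request
```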
@ku1ik commented on GitHub (Jun 8, 2017):
Updated source code so it adds extra new line, checked under CentOS and still fails. So the above theory is wrong.
@ku1ik commented on GitHub (Jun 8, 2017):
This works under CentOS with HTTPS:
So maybe the SSL lib used by Python is different from the one used by curl, and the problem lies somewhere in SSL-land?
@andyone commented on GitHub (Jun 8, 2017):
I think so. Python uses OpenSSL, curl uses NSS.
@ku1ik commented on GitHub (Jun 8, 2017):
@andyone the certificate for ascii.kaos.io is not Let's Encrypt?
@andyone commented on GitHub (Jun 8, 2017):
RapidSSL SHA256withRSA
@ku1ik commented on GitHub (Jun 8, 2017):
Normally I would say CentOS is missing root certificate for Let's Encrypt (or something like that 😊 ), but the SSL connection is being made and the error is on HTTP protocol level (400 Bad Request) so ... 👐
@andyone commented on GitHub (Jun 8, 2017):
If the root certificate for Let's Encrypt were missing, it would not work even with curl.
@johnl commented on GitHub (Jun 8, 2017):
Our (Brightbox) load balancer does indeed use haproxy. The HTTP RFC and the haproxy docs do state that one CRLF is required to separate the headers from the body:
https://github.com/haproxy/haproxy/blob/master/doc/internals/http-parsing.txt
Is it possible that you're only sending a CR or a LF here, rather than a full CRLF?
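One way to answer that from a capture is to scan the request head for line endings that are not full CRLFs; a sketch, with illustrative request bytes:

```python
import re

# Report byte offsets of bare LF line endings (an LF not preceded by a
# CR) in a raw HTTP request head -- the kind of non-CRLF ending that a
# strict parser like haproxy's would reject.

def bare_line_endings(raw_head: bytes):
    """Return byte offsets of LF characters not preceded by CR."""
    return [m.start() for m in re.finditer(rb"(?<!\r)\n", raw_head)]

good = b"POST / HTTP/1.1\r\nHost: example.org\r\n\r\n"
bad = b"POST / HTTP/1.1\nHost: example.org\n\n"

print(bare_line_endings(good))  # []
print(bare_line_endings(bad))   # [15, 33, 34]
```

Running this over the hexdumps attached below would show immediately whether any bare LF (or bare CR) ever reaches the wire.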
@andyone commented on GitHub (Jun 8, 2017):
@sickill This is a proxy on HA-Proxy 1.7.5 with LibreSSL 2.5.0 - https://ascii-ha.kaos.io. My and @ThiefMaster's recordings, and over-4k.json from your repository, uploaded successfully over this proxy.
@ku1ik commented on GitHub (Jun 9, 2017):
@andyone ok. So, can you change tune.bufsize (https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#3.2-tune.bufsize) to 4096?
@ku1ik commented on GitHub (Jun 9, 2017):
@johnl I checked for CRLF and all is OK here.
I tcpdumped the request on both CentOS and macOS again (over HTTP, again, assuming the HTTP payload is the same for HTTPS).
dump-centos.pcap.txt and dump-mac.pcap.txt contain the tcpdump captures (tcpdump -s 0 dst port 80 -w dump-centos.pcap.txt). dump-centos-hex.txt and dump-mac-hex.txt contain hex-formatted dumps (via hexdump -C).
dump-centos-hex.txt
dump-centos.pcap.txt
dump-mac-hex.txt
dump-mac.pcap.txt
It seems on both OSes there's CRLF used for new lines, and there's one blank line between headers and body.
@ku1ik commented on GitHub (Jun 9, 2017):
On the left CentOS, on the right macOS:
@andyone commented on GitHub (Jun 9, 2017):
@sickill Config updated. over-4k.json uploaded as well.
@ku1ik commented on GitHub (Jun 9, 2017):
@andyone thanks for the update. It seems it doesn't add the X-Forwarded-Proto header (because the returned recording URL is http://). Can you add http-request set-header X-Forwarded-Proto https if { ssl_fc }?
@andyone commented on GitHub (Jun 9, 2017):
This is my config:
Where should I add this line?
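For reference, a minimal sketch of where such a line could sit in a haproxy config; the section names, addresses, and certificate path are illustrative, not the actual config from this thread:

```
frontend https-in
    bind :443 ssl crt /etc/haproxy/cert.pem
    default_backend app

backend app
    # Tell the app the original request came in over TLS so it builds
    # https:// URLs; "if { ssl_fc }" matches only SSL-terminated requests.
    http-request set-header X-Forwarded-Proto https if { ssl_fc }
    option forwardfor
    server app1 10.0.0.2:8080
```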
@ku1ik commented on GitHub (Jun 9, 2017):
@andyone I think it needs to go into the backend section (I'm not a haproxy expert though).
@ku1ik commented on GitHub (Jun 9, 2017):
@andyone btw, I REALLY appreciate you helping debug this 😍 Thanks!
@johnl commented on GitHub (Jun 9, 2017):
don't forget forward-for too. This should replicate the setup pretty closely, with the ssl ciphers too:
@andyone commented on GitHub (Jun 9, 2017):
I modified config to this, but with no luck:
The client still returns links with http://.
I'm always happy to help improve useful services 😉.
@andyone commented on GitHub (Jun 9, 2017):
@johnl This is the full config; all required options are set in the defaults and global sections:
@ku1ik commented on GitHub (Jun 9, 2017):
If @andyone's haproxy config is now very close to BB and we still can't reproduce the issue, does it make sense to try with Let's Encrypt cert? This is one of the differences between https://ascii-ha.kaos.io and https://asciinema.org.
@andyone commented on GitHub (Jun 9, 2017):
No. BB LB can be built with OpenSSL (I use LibreSSL).
I will try to add Let's Encrypt certificate for https://ascii-ha.kaos.io.
@andyone commented on GitHub (Jun 9, 2017):
Done - https://ascii.kaos.re
HA-Proxy 1.7.5 (w/ LibreSSL 2.5.0) + Let's Encrypt certificate (created by Certbot)
Config:
@andyone commented on GitHub (Jun 9, 2017):
Looks like all works fine. over-4k.json uploaded successfully.
@ku1ik commented on GitHub (Jun 12, 2017):
I have no further ideas for this. I'm considering rolling back to my own Nginx instance for load balancing and SSL termination 🤕
@johnl commented on GitHub (Jun 12, 2017):
I'm trying to whittle this down to a single curl command that can reproduce the problem, but haven't managed it yet, can anyone help?
I'm POSTing a 5k body, with an authentication username/password using curl. I'm hitting a Brightbox load balancer with a netcat web server backend, so I can see the raw request text. It always goes through - can't make it trigger a bad request response.
If this is being rejected by the load balancer, I should not need a real instance of the app on the backend, as it should never get that far - so we should be able to reproduce this with curl and no app.
I've tried curl on ubuntu and centos7, and with openssl specifically (note you can specify the --engine option to curl to choose which SSL lib to use; centos7 curl binaries are built against the most options).
@ku1ik commented on GitHub (Jun 12, 2017):
@johnl thanks for looking into this.
Makes sense to use netcat as the backend for testing 👍
The curl equivalent for asciinema upload over-4k.json is more or less this:
(replace uuid4 with the result of python3 -c 'import uuid; print(uuid.uuid4())')
And it works with curl indeed...
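For comparison, roughly what the client-side request looks like can be sketched with urllib. The endpoint path, header set, and credentials below are simplified assumptions for illustration, not asciinema's exact API:

```python
import base64
import urllib.request

# Build (but don't send) a POST resembling the client's upload: a JSON
# body with HTTP Basic auth, where the password is the install id.
# The URL and headers are assumptions, not the client's exact request.

def build_upload_request(body: bytes, user: str, install_id: str):
    credentials = base64.b64encode(f"{user}:{install_id}".encode()).decode()
    return urllib.request.Request(
        "https://asciinema.org/api/asciicasts",  # assumed endpoint
        data=body,
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_upload_request(b"{}" * 3000, "asciinema", "uuid4-here")
print(req.get_method())      # POST
print(len(req.data) > 4096)  # True -- over the ~4KB threshold
```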
I compared tcpdumps of asciinema upload and the above curl, and there isn't anything at the HTTP protocol level that looks suspicious to me. However, some tcp frames show up in different locations (maybe more/less data is sent/fits in each tcp packet).
@ku1ik commented on GitHub (Jun 12, 2017):
I captured the HTTP request (to http://asciinema.org) with tcpflow in a CentOS 7 VM:
Then in another shell (in the same VM) ran:
I cut off the response from it, leaving only request. Here's what gets sent, byte by byte: tcpflow-req.txt
I replayed this captured HTTP request against asciinema.org:80 with nc:
All good.
Now, I've sent it over SSL to asciinema.org:443:
Here's the result:
/cc @johnl
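The replay step can also be scripted instead of going through nc or openssl s_client, which avoids any newline translation a terminal might introduce; a sketch (host, port, and capture file are whatever you recorded against):

```python
import socket
import ssl

# Replay a raw, byte-exact HTTP request against a server, optionally
# wrapping the connection in TLS, and return the first chunk of the
# response (enough to see the status line).

def replay(host: str, port: int, raw_request: bytes, use_tls: bool = False) -> bytes:
    sock = socket.create_connection((host, port), timeout=10)
    if use_tls:
        ctx = ssl.create_default_context()
        sock = ctx.wrap_socket(sock, server_hostname=host)
    with sock:
        sock.sendall(raw_request)  # bytes go out exactly as captured
        return sock.recv(4096)     # first chunk: status line + headers

# e.g. replay("asciinema.org", 443, open("tcpflow-req.txt", "rb").read(), True)
```

Because the bytes are sent verbatim from the capture file, a 400 here but not over plain nc would point squarely at the TLS path.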
@andyone commented on GitHub (Jun 12, 2017):
@sickill Can you check same request with https://ascii.kaos.re?
@ku1ik commented on GitHub (Jun 12, 2017):
@andyone just checked. Did this: (cat tcpflow-req.txt; cat) | openssl s_client -connect ascii.kaos.re:443 - uploaded successfully.
@johnl commented on GitHub (Jun 12, 2017):
I've done more digging here. curl on centos7 uses nss but wget uses openssl. I can successfully send the request with either curl or wget. I can even send using the python httpie tool (under python 3).
but it fails sending it to openssl s_client via stdin
but it succeeds sending it to openssl s_client by pasting the request into it, rather than using stdin!
I'm now pretty sure this is because something is sending requests with LF line endings rather than the required CRLF line endings, but I'm not sure quite what. I think "openssl s_client" is a bad testing tool and is making it difficult to be sure what is going on.
But I've yet to reproduce this with a proper http client, whether using nss or openssl (curl on ubuntu uses openssl and works fine too, so double confirmed that). Anyone else manage that?
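If bare LFs really are the culprit, normalizing the capture before piping it into s_client would isolate the variable; a sketch:

```python
import re

# Rewrite bare LF line endings as CRLF without touching endings that
# are already CRLF -- useful for normalizing a captured request before
# piping it into a strict TLS client via stdin.

def lf_to_crlf(data: bytes) -> bytes:
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)

mixed = b"POST / HTTP/1.1\nHost: example.org\r\n\n"
print(lf_to_crlf(mixed))
# b'POST / HTTP/1.1\r\nHost: example.org\r\n\r\n'
```

If the unmodified capture fails and the normalized one succeeds over the same stdin path, the line endings are the problem; if both fail, the tooling (s_client) is the suspect.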
@benaryorg commented on GitHub (Jun 20, 2017):
I've just done some testing on my own and can confirm that this problem persists with a content length of 4520, but not with the same request stripped by 1000 characters (Content-Length adjusted according to the changes made). The CRLFs are present in all my tests, and xxd confirms that they are sent over the pipe. I could also test with OpenBSD's nc (which supports TLS).
From the documentation:
As opposed to nginx, which does not keep the whole request in memory but passes it on on the fly (AFAIK), or at the very least buffers it into a temporary file.
There is the no option http-buffer-request option which, if I got that right, disables exactly that behaviour (it is documented under option http-buffer-request, without the no):
@peterbrittain commented on GitHub (Jul 7, 2017):
I've just hit this too. It strikes me that with your testing of the same content working over HTTP but not HTTPS, it's unlikely to be the buffer sizes at fault, unless something between your client and the proxy is adding a lot of extra headers.
But maybe there is a bug in whatever is terminating your SSL connections such that it slightly corrupts the headers.
If so, there is an option that reduces the security of HAProxy, but allows less compliant HTTP traffic through. See https://stackoverflow.com/questions/39286346/extra-space-in-http-headers-gives-400-error-on-haproxy
While I don't advocate reducing security as a final fix, this might allow you to maintain the service while you're debugging it.
@ku1ik commented on GitHub (Jul 28, 2017):
@peterbrittain at the moment asciinema.org uses Brightbox Cloud load balancer, so I don't control their Haproxy config. We used to terminate SSL in our own Nginx and that was working fine. Since I switched to BB LB this problem occurs (for some). Are you experiencing it under CentOS, or other system?
Frankly, I never had any problems with the previous Nginx-based solution. The SSL certificate we had was expiring, so I thought I'd go with Let's Encrypt. Since LE certs are short-lived, they are best managed automatically, and Brightbox LB does that for me. I just wanted to save myself the work of setting LE up, and BB LB seemed to be the simplest solution (since asciinema.org is sponsored by Brightbox and runs on their great infrastructure). Now I think setting up LE myself in Nginx would probably have taken 1/10 of the time I have already spent troubleshooting this issue 😞😞😞
@peterbrittain commented on GitHub (Jul 28, 2017):
Ah. I didn't spot the subtlety of who owned which bits. Have you had any luck getting diags from BB for this issue?
And in answer to your question: my box is a CentOS 6 VM.
@ThomasWaldmann commented on GitHub (Aug 14, 2017):
I also just experienced the bad request issue, using asciinema 1.2.0 (version from ubuntu 16.04 lts).
The curl hack given above worked, thanks.
@benaryorg commented on GitHub (Aug 15, 2017):
I just discovered that the very same file does yield a bad request on my Gentoo[1] box, but not on my OpenBSD[2] box.
The OpenBSD uploads it just fine.
I think there should be further investigation into the difference between these clients.
The Gentoo box supports the following Python targets per ebuild:
I can't currently test python3.5 easily though, but maybe this does help already.
Edit: I added the OpenSSL versions, completely forgot about those.
[1]: Gentoo GNU/Linux
[2]: OpenBSD 6.1
@ku1ik commented on GitHub (Aug 20, 2017):
I've just switched back to previous config (terminating SSL in Nginx). Let me know if it works for you now @andyone @ThiefMaster @benaryorg @peterbrittain @ThomasWaldmann
@benaryorg commented on GitHub (Aug 20, 2017):
@sickill I'm only 85% sure it's the same file that failed before, but if it is, you've fixed it.
@andyone commented on GitHub (Aug 20, 2017):
@sickill Works like a charm for me now. 👍
@ThomasWaldmann commented on GitHub (Aug 21, 2017):
Yup, works for me (with asciinema upload) now too. Thanks!