[GH-ISSUE #67] Uploading large files causes OverflowError #35

Closed
opened 2026-02-27 15:46:07 +03:00 by kerem · 11 comments
Owner

Originally created by @aelth on GitHub (Sep 28, 2021).
Original GitHub issue: https://github.com/proxmoxer/proxmoxer/issues/67

Originally assigned to: @jhollowe on GitHub.

Hello,

I was trying to upload an ISO file (large one, larger than 5 GBs) and following the example from the documentation, I got an error:

storage.upload.create(content='iso', filename=open(iso_path, 'rb'))
...
OverflowError: string longer than 2147483647 bytes

By following the stack trace, I realized that the wrong argument type is being used in the create/POST call.
When filename argument is being used, proxmoxer https backend transforms it into a files parameter for requests (basically, a dictionary):

> /tmp/env/lib/python3.9/site-packages/proxmoxer/backends/https.py(181)request()
-> return super(ProxmoxHttpSession, self).request(method, url, params, data, headers, cookies, files, auth,
(Pdb) url
'https://prox:8006/api2/json/nodes/prox/storage/local/upload'
(Pdb) method
'POST'
(Pdb) data
{'content': 'iso'}
(Pdb) files
{'filename': <_io.BufferedReader name='/path/to/iso/example.iso'>}

Since the requests' module parameter should be file and not filename, the following call succeeds:

storage.upload.create(content='iso', file=open(iso_path, 'rb'))

I suggest you test this and update the documentation if this is really the case in general, or I'm having some weird quirks:)

Also, download-url method does not work, it simply isn't parsed properly and it's dynamic generation fails.

Originally created by @aelth on GitHub (Sep 28, 2021). Original GitHub issue: https://github.com/proxmoxer/proxmoxer/issues/67 Originally assigned to: @jhollowe on GitHub. Hello, I was trying to upload an ISO file (large one, larger than 5 GBs) and following the example from the documentation, I got an error: ``` storage.upload.create(content='iso', filename=open(iso_path, 'rb')) ... OverflowError: string longer than 2147483647 bytes ``` By following the stack trace, I realized that the wrong argument type is being used in the `create`/`POST` call. When `filename` argument is being used, proxmoxer https backend transforms it into a `files` parameter for requests (basically, [a dictionary](https://docs.python-requests.org/en/latest/user/quickstart/#post-a-multipart-encoded-file)): ``` > /tmp/env/lib/python3.9/site-packages/proxmoxer/backends/https.py(181)request() -> return super(ProxmoxHttpSession, self).request(method, url, params, data, headers, cookies, files, auth, (Pdb) url 'https://prox:8006/api2/json/nodes/prox/storage/local/upload' (Pdb) method 'POST' (Pdb) data {'content': 'iso'} (Pdb) files {'filename': <_io.BufferedReader name='/path/to/iso/example.iso'>} ``` Since the requests' module parameter should be **`file`** and not **`filename`**, the following call succeeds: ``` storage.upload.create(content='iso', file=open(iso_path, 'rb')) ``` I suggest you test this and update the documentation if this is really the case in general, or I'm having some weird quirks:) Also, `download-url` method does not work, it simply isn't parsed properly and it's dynamic generation fails.
Author
Owner

@aelth commented on GitHub (Sep 28, 2021):

A small update - I was wrong about the file argument, it doesn't work at all with file and with filename, it fails for large files.
If I leave file argument, there is no overflow error shown (from the requests part), but although the file is seemingly being uploaded correctly, it fails because the proxmox API handler expects filename argument:

1147 sub file_upload_multipart {
1148     my ($self, $reqstate, $auth, $method, $path, $rstate) = @_;
1149 
1150     eval {
1151         my $boundary = $rstate->{boundary};
1152         my $hdl = $reqstate->{hdl};
1153 
1154         my $startlen = length($hdl->{rbuf});
1155 
1156         if ($rstate->{phase} == 0) { # skip everything until start
1157             if ($hdl->{rbuf} =~ s/^.*?--\Q$boundary\E  \015?\012
1158                        ((?:[^\015]+\015\012)* ) \015?\012//xs) {
1159                 my $header = $1;
1160                 my ($ct, $disp, $name, $filename);
1161                 foreach my $line (split(/\015?\012/, $header)) {
1162                     # assume we have single line headers
1163                     if ($line =~ m/^Content-Type\s*:\s*(.*)/i) {
1164                         $ct = parse_content_type($1);
1165                     } elsif ($line =~ m/^Content-Disposition\s*:\s*(.*)/i) {
1166                         ($disp, $name, $filename) = parse_content_disposition($1);
1167                     }
1168                 }
1169 
1170                 if (!($disp && $disp eq 'form-data' && $name)) {
1171                     syslog('err', "wrong content disposition in multipart - abort upload");
1172                     $rstate->{phase} = -1;
1173                 } else {
1174 
1175                     $rstate->{fieldname} = $name;
1176 
1177                     if ($filename) {
1178                         if ($name eq 'filename') {
1179                             # found file upload data
1180                             $rstate->{phase} = 1;
1181                             $rstate->{filename} = $filename;
1182                         } else {
1183                             syslog('err', "wrong field name for file upload - abort upload");
1184                             $rstate->{phase} = -1;
<!-- gh-comment-id:929666311 --> @aelth commented on GitHub (Sep 28, 2021): A small update - I was wrong about the `file` argument, it doesn't work at all with `file` and with `filename`, it fails for large files. If I leave `file` argument, there is no overflow error shown (from the requests part), but although the file is seemingly being uploaded correctly, it fails because the proxmox API handler [expects `filename` argument](https://git.proxmox.com/?p=pve-http-server.git;a=blob;f=src/PVE/APIServer/AnyEvent.pm;h=cd778064ff35c96aed714259734baa58224b6de1;hb=HEAD#l1178): ``` 1147 sub file_upload_multipart { 1148 my ($self, $reqstate, $auth, $method, $path, $rstate) = @_; 1149 1150 eval { 1151 my $boundary = $rstate->{boundary}; 1152 my $hdl = $reqstate->{hdl}; 1153 1154 my $startlen = length($hdl->{rbuf}); 1155 1156 if ($rstate->{phase} == 0) { # skip everything until start 1157 if ($hdl->{rbuf} =~ s/^.*?--\Q$boundary\E \015?\012 1158 ((?:[^\015]+\015\012)* ) \015?\012//xs) { 1159 my $header = $1; 1160 my ($ct, $disp, $name, $filename); 1161 foreach my $line (split(/\015?\012/, $header)) { 1162 # assume we have single line headers 1163 if ($line =~ m/^Content-Type\s*:\s*(.*)/i) { 1164 $ct = parse_content_type($1); 1165 } elsif ($line =~ m/^Content-Disposition\s*:\s*(.*)/i) { 1166 ($disp, $name, $filename) = parse_content_disposition($1); 1167 } 1168 } 1169 1170 if (!($disp && $disp eq 'form-data' && $name)) { 1171 syslog('err', "wrong content disposition in multipart - abort upload"); 1172 $rstate->{phase} = -1; 1173 } else { 1174 1175 $rstate->{fieldname} = $name; 1176 1177 if ($filename) { 1178 if ($name eq 'filename') { 1179 # found file upload data 1180 $rstate->{phase} = 1; 1181 $rstate->{filename} = $filename; 1182 } else { 1183 syslog('err', "wrong field name for file upload - abort upload"); 1184 $rstate->{phase} = -1; ```
Author
Owner

@jhollowe commented on GitHub (Sep 30, 2021):

This is just a limitation of requests. I'm looking into trying to use requests_toolbox's MultipartEncoder to upload larger files. I'll update this thread with progress

<!-- gh-comment-id:930785331 --> @jhollowe commented on GitHub (Sep 30, 2021): This is just a limitation of requests. I'm looking into trying to use requests_toolbox's MultipartEncoder to upload larger files. I'll update this thread with progress
Author
Owner

@jhollowe commented on GitHub (Sep 30, 2021):

@aelth what versions are you using (python, PVE, and proxmoxer)?

<!-- gh-comment-id:931363112 --> @jhollowe commented on GitHub (Sep 30, 2021): @aelth what versions are you using (python, PVE, and proxmoxer)?
Author
Owner

@jhollowe commented on GitHub (Sep 30, 2021):

It appears that requests already will multipart encode the data, so requests_toolbox's MultipartEncoder is redundant.

The issue is that the call to requests is not streaming the file but is attempting to send the entire file in a single send which causes issues in the SSL layer due to a 32bit signed int size limit (the 2147483647 bytes).

The SSL error will hopefully be fixed soon, but this is really just an indication of requests (or at least how we are using requests) trying to read everything into memory and then sending. I'll keep working on it. In the meantime, you can use the download-url endpoint to pull the ISO from an external source.

References:
Issue for SSL
Issue for requests

<!-- gh-comment-id:931430527 --> @jhollowe commented on GitHub (Sep 30, 2021): It appears that requests already will multipart encode the data, so requests_toolbox's MultipartEncoder is redundant. The issue is that the call to requests is not streaming the file but is attempting to send the entire file in a single send which causes issues in the SSL layer due to a 32bit signed int size limit (the 2147483647 bytes). The SSL error will hopefully be fixed soon, but this is really just an indication of requests (or at least how we are using requests) trying to read everything into memory and then sending. I'll keep working on it. In the meantime, you can use the [download-url](https://pve.proxmox.com/pve-docs/api-viewer/index.html#/nodes/{node}/storage/{storage}/download-url) endpoint to pull the ISO from an external source. References: [Issue for SSL](https://bugs.python.org/issue42853) [Issue for requests](https://github.com/psf/requests/issues/2717)
Author
Owner

@jhollowe commented on GitHub (Sep 30, 2021):

Also, download-url method does not work, it simply isn't parsed properly and it's dynamic generation fails.

look at #61 for how to deal with paths that contain hyphens.

<!-- gh-comment-id:931430981 --> @jhollowe commented on GitHub (Sep 30, 2021): > Also, download-url method does not work, it simply isn't parsed properly and it's dynamic generation fails. look at #61 for how to deal with paths that contain hyphens.
Author
Owner

@aelth commented on GitHub (Oct 2, 2021):

My environment is the following:

  • Python 3.9.7
  • proxmoxer 1.1.1 and requests 2.26.0
  • PVE 7.0-11 (with all patches applied)

Thanks for the help regarding the download-url method - the proposed alternative works, but the problem with this call is that it's asynchronous and I need to poll/check whether the ISO has been uploaded successfully and it adds additional complexity if the file being uploaded is local (I have to create a simple web server).

I'm still experimenting with download-url.

Thanks!

<!-- gh-comment-id:932762913 --> @aelth commented on GitHub (Oct 2, 2021): My environment is the following: - Python 3.9.7 - proxmoxer 1.1.1 and requests 2.26.0 - PVE 7.0-11 (with all patches applied) Thanks for the help regarding the `download-url` method - the proposed alternative works, but the problem with this call is that it's asynchronous and I need to poll/check whether the ISO has been uploaded successfully and it adds additional complexity if the file being uploaded is local (I have to create a simple web server). I'm still experimenting with `download-url`. Thanks!
Author
Owner

@jhollowe commented on GitHub (Oct 2, 2021):

@aelth if you are able, check the PR #68 and see if that resolves the issue for you. You will need to pip install requests_toolbelt if you don't already have it. I'd like someone else to test it before I merge it.

<!-- gh-comment-id:932827997 --> @jhollowe commented on GitHub (Oct 2, 2021): @aelth if you are able, check the PR #68 and see if that resolves the issue for you. You will need to `pip install requests_toolbelt` if you don't already have it. I'd like someone else to test it before I merge it.
Author
Owner

@aelth commented on GitHub (Oct 4, 2021):

@jhollowe I can confirm that this PR indeed solves the issue for me!
Large ISO files (> 2 GB), as well as small ISOs are uploaded successfully using upload call and filename parameter.

Thank you very much for your assistance.

P.S. You can close the ticket, but I would like to use this opportunity to ask whether you have some ideas regarding the download-url call. It works, but the HTTP call returns immediately with the task ID.
If I want to serve local ISOs, is my best option to use a simple HTTP server and poll the returned task ID status from Proxmox VE until the task is finished?

Thanks,
aelth

<!-- gh-comment-id:933869705 --> @aelth commented on GitHub (Oct 4, 2021): @jhollowe I can confirm that this PR indeed solves the issue for me! Large ISO files (> 2 GB), as well as small ISOs are uploaded successfully using `upload` call and `filename` parameter. Thank you very much for your assistance. P.S. You can close the ticket, but I would like to use this opportunity to ask whether you have some ideas regarding the `download-url` call. It works, but the HTTP call returns immediately with the task ID. If I want to serve local ISOs, is my best option to use a simple HTTP server and poll the returned task ID status from Proxmox VE until the task is finished? Thanks, aelth
Author
Owner

@jhollowe commented on GitHub (Oct 5, 2021):

yeah, you can use the task ID to poll for the task status from the nodes/{node}/tasks/{upid}/status endpoint. you could theoretically pull the log from the task, parse it, and weight your polling frequency based on how close to being done it is, but that would be (imho) more effort than it is worth. Now that the upload works, the only benefit I can see from download-url is the hash check, but if you are pulling from your local machine, you should trust the source (your machine) is providing the expected file.

So if the upload doesn't work, you could do the following steps:

  1. Use something like http.server to serve the ISO/template you want to transfer
  2. send a download-url API call to download the file from your machine
  3. query if the download-url task is complete
  4. if not complete, goto # 3, else continue
<!-- gh-comment-id:934666507 --> @jhollowe commented on GitHub (Oct 5, 2021): yeah, you can use the task ID to poll for the task status from the [nodes/{node}/tasks/{upid}/status](https://pve.proxmox.com/pve-docs/api-viewer/#/nodes/{node}/tasks/{upid}/status) endpoint. you could theoretically pull the log from the task, parse it, and weight your polling frequency based on how close to being done it is, but that would be (imho) more effort than it is worth. Now that the upload works, the only benefit I can see from `download-url` is the hash check, but if you are pulling from your local machine, you should trust the source (your machine) is providing the expected file. So if the upload doesn't work, you could do the following steps: 1. Use something like `http.server` to serve the ISO/template you want to transfer 2. send a `download-url` API call to download the file from your machine 3. query if the `download-url` task is complete 4. if not complete, goto # 3, else continue
Author
Owner

@jhollowe commented on GitHub (Oct 5, 2021):

I'm thinking it might be cool to add a new class that is a helper class for common tasks. So ideally you could pass it a task ID and it would block until it was complete and return the result and something similar for other qemu's exec.

<!-- gh-comment-id:934670099 --> @jhollowe commented on GitHub (Oct 5, 2021): I'm thinking it might be cool to add a new class that is a helper class for common tasks. So ideally you could pass it a task ID and it would block until it was complete and return the result and something similar for other qemu's exec.
Author
Owner

@aelth commented on GitHub (Oct 6, 2021):

I'm thinking it might be cool to add a new class that is a helper class for common tasks. So ideally you could pass it a task ID and it would block until it was complete and return the result and something similar for other qemu's exec.

Yes, that sounds great!

I used a simple polling loop, but it would be great if proxmoxer had a helper class for such common tasks.

Thanks again for helping and solving the issue quickly!

Best regards,
aelth

<!-- gh-comment-id:936985808 --> @aelth commented on GitHub (Oct 6, 2021): > I'm thinking it might be cool to add a new class that is a helper class for common tasks. So ideally you could pass it a task ID and it would block until it was complete and return the result and something similar for other qemu's exec. Yes, that sounds great! I used a simple polling loop, but it would be great if proxmoxer had a helper class for such common tasks. Thanks again for helping and solving the issue quickly! Best regards, aelth
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/proxmoxer#35
No description provided.