I updated my comment with thoughts. Please check it out.
Oh, I didn't know JPEG had this in metadata. Anyway you don't need to download anything, just poke at it with a HEAD request:
curl -sI https://s01.geekpic.net/di-CF3ZOA.png | grep content-length
Output: content-length: 193925
PS: HTTP header names are case-insensitive, so some servers send 'Content-Length' instead of 'content-length'. Use grep -i to match either spelling.
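If you'd rather do the matching in Python than in grep, here is a minimal sketch that pulls the size out of the raw header text (e.g. the output of curl -sI) without caring about capitalization. The function name content_length_from_headers is made up for illustration; it is not from any library.

```python
def content_length_from_headers(raw_headers):
    """Return the Content-Length value from raw header text, or None.

    Header names are compared in lowercase, so 'Content-Length' and
    'content-length' are treated the same.
    """
    for line in raw_headers.splitlines():
        # Split on the first colon only; values such as dates contain colons too.
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "content-length":
            return int(value.strip())
    return None
```

Feeding it the header text from either an HTTP/1.1 or an HTTP/2 response should give the same number, since only the capitalization of the header name differs.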
The problem with that command is that you have to download the image in its entirety to get its size.
Also, poal uses Python and JS only.
The code poal uses downloads the header only.
No no no no no!
HEAD
The HEAD method asks for a response identical to that of a GET request, but without the response body. This is useful for retrieving meta-information written in response headers, without having to transport the entire content.
Nothing's downloaded. The entire response (Wireshark, "Length" field) is 903 bytes and only contains the headers. You can test the same URL with http:// to see the unencrypted data in Wireshark (and as I suspected the capitalization of letters changed.)
I don't know what library you are using, but on the inside, all that's needed is to replace "GET /di-CF3ZOA.png" in the HTTP request with "HEAD /di-CF3ZOA.png" to get the headers.
https://docs.python.org/3/library/http.client.html#http.client.HTTPConnection.request
or https://stackoverflow.com/questions/107405/how-do-you-send-a-head-http-request-in-python-2
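For reference, a minimal sketch of the same HEAD request using only the standard library's http.client, as in the first link. The helper name head_content_length and the 10 second timeout are my own choices for illustration, and this is not the code poal actually runs.

```python
import http.client
from urllib.parse import urlsplit


def head_content_length(url, timeout=10):
    """Send a HEAD request and return Content-Length as an int, or None."""
    parts = urlsplit(url)
    # Pick the connection class from the URL scheme.
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # Same request line as a GET, only the method differs.
        conn.request("HEAD", path)
        resp = conn.getresponse()
        resp.read()  # a HEAD response carries no body, so this returns b""
        # getheader() looks the name up case-insensitively.
        value = resp.getheader("Content-Length")
        return int(value) if value is not None else None
    finally:
        conn.close()
```

Calling head_content_length("https://s01.geekpic.net/di-CF3ZOA.png") should report the same size curl printed, assuming the server still answers HEAD requests for that URL. The sketch does not follow redirects.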
PS/Edit: Maybe I should have specified, curl -I is curl --head, aka "-I, --head Show document info only", the HTTP HEAD request.
Maybe I should have specified, curl -I
Ok didn't know about -I
I use the requests library to get the headers.
Everything works perfectly fine except for some corrupt JPEG files.
I uploaded one, you can try it with your code if you want and see if you get the same results.