By the way, where do you guys think the upload tags report link should be placed in the user profile? I see options to put it in the subnavbar or near the upload count, but can't decide.
I'd prefer it next to the upload count.
Example:
Uploads 7669 (Tag Change Report)
I thought about just Report, but that could be misconstrued as reporting that user for administrative action. Uploads Report is a bit awkward (for me at least), seeing as how the word "Uploads" is already used in the row heading.
I don't often upload content from Tumblr, but I'm having issues uploading PNG files. I just tried using the bookmarklet to upload the following exact image URL http://68.media.tumblr.com/ff41206113b69a05283784a121a9190a/tumblr_okf33cpXvt1v2gw9ko1_1280.png and instead got a JPEG for post #2669449. I'm going to manually upload the PNG, since there is a difference on inspection between the two images. Is this a known issue for Tumblr?
Well this is bizarre. I tried downloading the file several times and sometimes it gives me a 2.7M PNG and sometimes it gives me a 648K JPEG. It seems random which file I get. No idea what's going on here.
admin@icarus:~ % wget https://68.media.tumblr.com/ff41206113b69a05283784a121a9190a/tumblr_okf33cpXvt1v2gw9ko1_1280.png
--2017-03-24 03:19:23-- https://68.media.tumblr.com/ff41206113b69a05283784a121a9190a/tumblr_okf33cpXvt1v2gw9ko1_1280.png
Resolving 68.media.tumblr.com (68.media.tumblr.com)... 69.147.82.57, 216.115.96.175, 69.147.82.56, ...
Connecting to 68.media.tumblr.com (68.media.tumblr.com)|69.147.82.57|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2844421 (2.7M) [image/png]
Saving to: ‘tumblr_okf33cpXvt1v2gw9ko1_1280.png’
tumblr_okf33cpXvt1v2gw9ko1_1280.png 100%[==========================================================================================================================================>] 2.71M 1.58MB/s in 1.7s
2017-03-24 03:19:26 (1.58 MB/s) - ‘tumblr_okf33cpXvt1v2gw9ko1_1280.png’ saved [2844421/2844421]

admin@icarus:~ % wget https://68.media.tumblr.com/ff41206113b69a05283784a121a9190a/tumblr_okf33cpXvt1v2gw9ko1_1280.png
--2017-03-24 03:19:33-- https://68.media.tumblr.com/ff41206113b69a05283784a121a9190a/tumblr_okf33cpXvt1v2gw9ko1_1280.png
Resolving 68.media.tumblr.com (68.media.tumblr.com)... 216.115.96.179, 69.147.82.56, 216.115.96.175, ...
Connecting to 68.media.tumblr.com (68.media.tumblr.com)|216.115.96.179|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 663298 (648K) [image/jpeg]
Saving to: ‘tumblr_okf33cpXvt1v2gw9ko1_1280.png.1’
tumblr_okf33cpXvt1v2gw9ko1_1280.png.1 100%[==========================================================================================================================================>] 647.75K 1.12MB/s in 0.6s
2017-03-24 03:19:34 (1.12 MB/s) - ‘tumblr_okf33cpXvt1v2gw9ko1_1280.png.1’ saved [663298/663298]

admin@icarus:~ % identify tumblr_okf33cpXvt1v2gw9ko1_1280.png tumblr_okf33cpXvt1v2gw9ko1_1280.png.1
tumblr_okf33cpXvt1v2gw9ko1_1280.png PNG 1280x1819 1280x1819+0+0 8-bit sRGB 2.844MB 0.000u 0:00.000
tumblr_okf33cpXvt1v2gw9ko1_1280.png.1 JPEG 1280x1819 1280x1819+0+0 8-bit sRGB 663KB 0.000u 0:00.000
EDIT: it seems to be based on the IP of the server. http://68.media.tumblr.com has 4 IPs:
% drill 68.media.tumblr.com
;; ANSWER SECTION:
68.media.tumblr.com. 23016 IN CNAME edge2.gycs.b.yahoodns.net.
edge2.gycs.b.yahoodns.net. 8 IN A 69.147.82.57
edge2.gycs.b.yahoodns.net. 8 IN A 216.115.96.175
edge2.gycs.b.yahoodns.net. 8 IN A 216.115.96.179
edge2.gycs.b.yahoodns.net. 8 IN A 69.147.82.56
http://69.147.82.57 returns the PNG:
admin@icarus:~ % wget --header "Host: 68.media.tumblr.com" http://69.147.82.57/ff41206113b69a05283784a121a9190a/tumblr_okf33cpXvt1v2gw9ko1_1280.png
--2017-03-24 03:37:04-- http://69.147.82.57/ff41206113b69a05283784a121a9190a/tumblr_okf33cpXvt1v2gw9ko1_1280.png
Connecting to 69.147.82.57:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2844421 (2.7M) [image/png]
Saving to: ‘tumblr_okf33cpXvt1v2gw9ko1_1280.png’
tumblr_okf33cpXvt1v2gw9ko1_1280.png 100%[==========================================================================================================================================>] 2.71M 1.40MB/s in 1.9s
2017-03-24 03:37:06 (1.40 MB/s) - ‘tumblr_okf33cpXvt1v2gw9ko1_1280.png’ saved [2844421/2844421]
And http://216.115.96.179 returns the JPEG:
% wget --header "Host: 68.media.tumblr.com" http://216.115.96.179/ff41206113b69a05283784a121a9190a/tumblr_okf33cpXvt1v2gw9ko1_1280.png
--2017-03-24 03:37:31-- http://216.115.96.179/ff41206113b69a05283784a121a9190a/tumblr_okf33cpXvt1v2gw9ko1_1280.png
Connecting to 216.115.96.179:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 663298 (648K) [image/jpeg]
Saving to: ‘tumblr_okf33cpXvt1v2gw9ko1_1280.png.1’
tumblr_okf33cpXvt1v2gw9ko1_1280.png.1 100%[==========================================================================================================================================>] 647.75K 1.16MB/s in 0.5s
2017-03-24 03:37:32 (1.16 MB/s) - ‘tumblr_okf33cpXvt1v2gw9ko1_1280.png.1’ saved [663298/663298]
evazion said:
Well this is bizarre. I tried downloading the file several times and sometimes it gives me a 2.7M PNG and sometimes it gives me a 648K JPEG. It seems random which file I get. No idea what's going on here.
Maybe have a look at post #2651342 and one of its child posts as well? Same behavior there. I believe the deleted parent is the better quality one, actually.
@BrokenEagle98 did your bot break in some way? It points to :large here post #2639755
Randeel said:
@BrokenEagle98 did your bot break in some way? It points to :large here post #2639755
Ah, thanks for pointing that out. It encountered a server error (500-599). It's supposed to sleep and try again after a minute, but instead it was just skipping to the next size.
Just for reference, if an HTTP response returns an error code 400-499, my script will try the next available size. I have encountered Twitter images where the ":orig" size 404'd but the ":large" was still available. Tumblr is another site where not all sizes are available. It's sort of built into Artstation.
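For readers following along, the intended behaviour described above (try the next size on a 4xx, sleep and retry the same size on a 5xx) could be sketched roughly as follows. This is not the actual bot code; `fetch_best_size`, the injectable `fetch` callable, and the `max_retries` limit are all hypothetical names chosen for illustration, and raising after the retries run out is an assumption about what a script might reasonably do instead of falling back to a smaller size.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def fetch_best_size(
    urls: Iterable[str],
    fetch: Callable[[str], Tuple[int, bytes]],
    retry_delay: float = 60.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Tuple[str, bytes]]:
    """Try each size URL in order of preference (best first).

    `fetch` is a hypothetical callable returning (status_code, body).
    """
    for url in urls:
        for attempt in range(max_retries + 1):
            status, body = fetch(url)
            if 200 <= status < 300:
                return url, body
            if 500 <= status < 600:
                # Server-side trouble: wait and retry the SAME size
                # rather than silently dropping to a smaller one.
                if attempt < max_retries:
                    sleep(retry_delay)
                    continue
                raise RuntimeError(f"{url} kept returning {status}")
            # 4xx (or anything else): this size is unavailable, try the next.
            break
    return None
```

Passing `fetch` and `sleep` in as parameters keeps the sketch independent of any particular HTTP library and makes the retry path easy to exercise without waiting a real minute.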
evazion said:
Well this is bizarre. I tried downloading the file several times and sometimes it gives me a 2.7M PNG and sometimes it gives me a 648K JPEG. It seems random which file I get. No idea what's going on here.
@BrokenEagle98, you might want to check how your bot handles Tumblr-sourced images. It also appears to be affected by random image retrieval (post #2669456).
D1ce said:
@BrokenEagle98, you might want to check how your bot handles Tumblr-sourced images. It also appears to be affected by random image retrieval (post #2669456).
Does it affect all servers, or only the 68 one?
On a side note, I've thought of having a developer's thread for a while now for stuff like the above. A while back I asked Type-kun about it and he was okay with it as long as it remains Danbooru related.
Thoughts?
BrokenEagle98 said:
Does it affect all servers, or only the 68 one?
On a side note, I've thought of having a developer's thread for a while now for stuff like the above. A while back I asked Type-kun about it and he was okay with it as long as it remains Danbooru related.
Thoughts?
I don't upload enough from there to say for certain, but the posts that gave me trouble came from the 68 one. I would think that a global catchall script for all Tumblr uploads would be safer, considering that the problem could later occur on their other servers. Would such a check be too impactful on performance?
It's not a question of performance...
There are two unsolvable issues (as far as I'm aware)...
1. Determining all of the server IPs at run time.
From what I've read, Tumblr uses a rotating list of servers from Yahoo that can vary from hour to hour. I've been unable to find a method to determine this at run time.
2. Being able to use those IP addresses dynamically.
From initial testing, both with my browser (Chrome 56) and with my script (Python 3.5), I am unable to use the IP address to get the image file. It only works when the hostname is used...??? It doesn't make a lot of sense since the headers I'm sending are exactly the same.
BrokenEagle98 said:
2. Being able to use those IP addresses dynamically.
From initial testing, both with my browser (Chrome 56) and with my script (Python 3.5), I am unable to use the IP address to get the image file. It only works when the hostname is used...??? It doesn't make a lot of sense since the headers I'm sending are exactly the same.
Not sure if I understood you correctly, but if you use http://0.1.2.3/file instead of http://example.com/file then you’re not sending the same headers. Nowadays, most webservers support hosting more than one site on the same IP, so you need to send the correct hostname to tell the server which site you want. This often applies even if the server serves only one site. The only way around that is to tell your client to connect to a specific IP but send the correct hostname instead of the IP as part of the request.
If you use curl, you can try the --resolve <host:port:address> option. I expect Python to be able to do it too, but I have no idea how easy that is.
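As a rough illustration of the "connect to a specific IP but send the correct hostname" idea in Python, here is a minimal sketch using only the standard library. It mirrors the plain-HTTP `wget --header "Host: ..."` commands earlier in the thread; `build_pinned_request` is a hypothetical helper name, and HTTPS is deliberately left out because certificate and SNI handling would need more than a header.

```python
import urllib.request


def build_pinned_request(ip: str, hostname: str, path: str) -> urllib.request.Request:
    """Build a plain-HTTP request that connects to `ip` but names `hostname`.

    urllib only fills in the Host header itself when the caller has not
    supplied one, so the explicit value below is what gets sent.
    """
    return urllib.request.Request(f"http://{ip}{path}", headers={"Host": hostname})


# Values taken from the wget example earlier in the thread.
req = build_pinned_request(
    "69.147.82.57",
    "68.media.tumblr.com",
    "/ff41206113b69a05283784a121a9190a/tumblr_okf33cpXvt1v2gw9ko1_1280.png",
)

# To actually download (not run here):
# with urllib.request.urlopen(req, timeout=30) as resp:
#     data = resp.read()
```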
Btw, consider sending HEAD requests to alternate hosts to check the file size and possibly modification date to avoid downloading the same file multiple times.
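A minimal sketch of that HEAD-based comparison, assuming the servers report an honest Content-Type and Content-Length (which is not guaranteed, and a HEAD response may not match what a GET would return). `Candidate`, `head_candidate` and `pick_best` are hypothetical names, and preferring `image/png` is just an assumption based on the case discussed in this thread.

```python
import http.client
from typing import Iterable, NamedTuple, Optional


class Candidate(NamedTuple):
    ip: str
    content_type: str
    length: int


def head_candidate(ip: str, hostname: str, path: str, timeout: float = 30.0) -> Candidate:
    """Send a HEAD request to one IP (plain HTTP) and record type and size."""
    conn = http.client.HTTPConnection(ip, 80, timeout=timeout)
    try:
        conn.request("HEAD", path, headers={"Host": hostname})
        resp = conn.getresponse()
        return Candidate(
            ip,
            resp.getheader("Content-Type", ""),
            int(resp.getheader("Content-Length", "0")),
        )
    finally:
        conn.close()


def pick_best(candidates: Iterable[Candidate], wanted_type: str = "image/png") -> Optional[Candidate]:
    """Prefer the wanted content type; among those, take the largest file."""
    pool = list(candidates)
    matching = [c for c in pool if c.content_type == wanted_type]
    return max(matching or pool, key=lambda c: c.length, default=None)
```

With the two responses seen above (a 2844421-byte `image/png` and a 663298-byte `image/jpeg`), `pick_best` would select the PNG host, and only that one file would then need a full GET.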
The resized versions of post #4087, post #1357 and post #930 are broken, not sure if anything can be done about it.
I made issue #2938 for the Tumblr bug.
On a side note, I've thought of having a developer's thread for a while now for stuff like the above. A while back I asked Type-kun about it and he was okay with it as long as it remains Danbooru related.
This would be a good idea. This thread is hard to follow as it stands.
The resized versions of post #4087, post #1357 and post #930 are broken, not sure if anything can be done about it.
I think this is related to issue #2500. Images for older posts are slowly being migrated offsite to save disk space.
evazion said:
I think this is related to issue #2500. Images for older posts are slowly being migrated offsite to save disk space.
That may be true, I just came across them when going through old tags I have favorited so I thought I'd report them to be on the safe side since they are very old.
On another note, after cleaning up all (?) posts I found in solo solo_focus, the search only gives me this error.
PG::QueryCanceled exception raised
ERROR: canceling statement due to statement timeout
app/logical/post_sets/post.rb:126:in `posts'
app/controllers/posts_controller.rb:15:in `index'
kittey said:
Not sure if I understood you correctly, but if you use http://0.1.2.3/file instead of http://example.com/file then you’re not sending the same headers. Nowadays, most webservers support hosting more than one site on the same IP, so you need to send the correct hostname to tell the server which site you want. This often applies even if the server serves only one site. The only way around that is to tell your client to connect to a specific IP but send the correct hostname instead of the IP as part of the request.
Many thanks for the above advice, as it worked correctly when I set the Host field to "68.media.tumblr.com". It was a bit confusing because the requests Python module that I use allows you to inspect the headers of the sent request, and it wasn't showing the Host field, so I thought it was unimportant... Maybe it was setting it and just not reporting it...?
Btw, consider sending HEAD requests to alternate hosts to check the file size and possibly modification date to avoid downloading the same file multiple times.
I've learned the hard way that servers often don't respond in the same manner to HEAD requests as GET requests... :/ Additionally, I've discovered that most of the servers lie about the modification datetime, usually setting it to the moment of the GET request.
Regardless, I'm still left with the problem of #1, i.e. determining all of the server IPs at runtime. For instance, none of the methods I've used to determine the IPs (NSLOOKUP, http://centralops.net/co/ , python, etc.) has yet returned the faulty IP of 216.115.96.179.
Unbreakable said:
On another note, after cleaning up all (?) posts I found in solo solo_focus, the search only gives me this error.
This is a longstanding problem, discussed in issue #1039. Basically, the problem is that for mutually exclusive tags (tags that independently have a large number of posts, but that when combined together produce few-to-no results), searches become very slow and likely to time out. Some other examples: solo multiple_girls, banned_artist -status:banned, highres lowres, long_hair no_humans.
There is a trick, however: add order:score to the search and like magic it won't time out. The reason why this works is fairly involved, but the basic explanation is that order:score tricks the database into searching using a different strategy than it normally would, which happens to be faster for these searches.
BrokenEagle98 said:
Regardless, I'm still left with the problem of #1, i.e. determining all of the server IPs at runtime. For instance, none of the methods I've used to determine the IPs (NSLOOKUP, http://centralops.net/co/ , python, etc.) has yet returned the faulty IP of 216.115.96.179.
It seems that their DNS gives you different sets of IPs based on your geographic location. I've tested the lookup on three machines and they each see different sets of IPs. On one machine, all of the IPs it gave me returned the JPEG. So it looks like even downloading the file from each IP isn't guaranteed to find the best file.
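For what it's worth, listing the addresses that one machine currently sees for a hostname can be sketched with the standard library as below. `resolve_ipv4` is a hypothetical helper, and as noted above the result only reflects the resolver and location it is run from, so it cannot be treated as the complete set of servers.

```python
import socket
from typing import Callable, List


def resolve_ipv4(hostname: str, resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the distinct IPv4 addresses the local resolver reports right now.

    `resolver` defaults to socket.getaddrinfo; it is a parameter only so the
    function can be exercised without touching the network.
    """
    infos = resolver(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


# Example call (performs a real DNS lookup, so it is left commented out):
# print(resolve_ipv4("68.media.tumblr.com"))
```

Running the same lookup from several machines in different regions and merging the results would widen coverage, but given the short record lifetimes shown in the drill output it would still be a snapshot rather than a guarantee.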