Donmai

ERROR: canceling statement due to statement timeout

Posted under Bugs & Features

@Question2 @tapnek
I was a little curious, so I did a little experimenting while logged out of my account and it seems that the same issue is seen with the ry tag as well. I looked at the ry tag because it also has a character length of 2, and is the second such artist tag in order of post count (A1 has 2045, ry has 433). I think it might have something to do with the shortness of the tag, and the number of instances that have that item tagged.

I think tag matching is done by checking tag strings to see if they contain the tag being searched for. (I haven't looked deeply into the database setup or tag searching code, so I could be wrong.) If that's the case, it makes sense that shorter tags take longer, because string searching is generally slower when searching for a short needle. (It's much harder to find a small needle in a large haystack than a large needle in a small haystack.) The reason this wouldn't be a problem with tags like hat is that there are a lot of recent posts with it, so the search finds enough posts to fill the page and terminates quickly.

So my guess is that short tags with infrequent posts are the worst case. It probably doesn't matter how many posts a tag has, but rather how many recent posts it has (where more is actually better).

The total time taken for a search would be something like <time to check each post for the tag> * (1 / <probability that each post has the tag>) * <page size>. That value has to be less than the timeout. So in the case where the tag is very short (making the first term large) and the number of recent posts is low (making the second term large), pretty much the only thing you can do is make the third term smaller. For example, adding &limit=5 to the URL or limit:5 to the search would be less likely to time out.

BrokenEagle98 said:

@Question2 @tapnek
I was a little curious, so I did a little experimenting while logged out of my account and it seems that the same issue is seen with the ry tag as well. I looked at the ry tag because it also has a character length of 2, and is the second such artist tag in order of post count (A1 has 2045, ry has 433). I think it might have something to do with the shortness of the tag, and the number of instances that have that item tagged.

Maybe it's related to the huge gaps that both a1 and ry have. Look at the dates of posts under the tags.
For a1, the first post was uploaded 3 months ago. The 20th post was uploaded over 2 years ago.
For ry, the first post was uploaded 2 months ago. The 20th post was uploaded over 2 years ago.
itou_noiji is another example with gaps like that.

I don't know the specifics of database performance, but I guess that having to scan over gaps of hundreds of thousands of posts to get the right results can make it slow.

If you use the sequential paginator to get past the gaps, it's faster. Example. The difference between the first and last post on that page is only a couple months, no multiple year gaps.

☆♪ said:

I think tag matching is done by checking tag strings to see if they contain the tag being searched for. (I haven't looked deeply into the database setup or tag searching code, so I could be wrong.) If that's the case, it makes sense that shorter tags take longer, because string searching is generally slower when searching for a short needle. (It's much harder to find a small needle in a large haystack than a large needle in a small haystack.) The reason this wouldn't be a problem with tags like hat is that there are a lot of recent posts with it, so the search finds enough posts to fill the page and terminates quickly.

Searches are done with a tag_index field which is indexed with gin. The tag_string field itself isn't used for searching.

Toks said:

Searches are done with a tag_index field which is indexed with gin. The tag_string field itself isn't used for searching.

Ah, well that does make a lot more sense. Guess I should have looked before I opened my mouth... I was really surprised when I saw that tags were stored as a string, but never bothered to figure out how it worked. Didn't know about GINs. Learned something new today!

So I guess it's just infrequent posts that matters. Limit should still help (along with sequential paginator).

Could this be the same problem I reported 7.2 and 5.4 months ago where the answer was to just increase the number of posts per page because for whatever reason it fails when using only 20? Assuming we all have the same number of forum posts per page, it was here and here.

Updated

OOZ662 said:

Could this be the same problem I reported 7.2 and 5.4 months ago where the answer was to just increase the number of posts per page because for whatever reason it fails when using only 20? Assuming we all have the same number of forum posts per page, it was here and here.

Interesting, it seems increasing posts per page does make the searches mentioned in this thread fast. As of right now, the numbers that worked were:
a1: 67 or higher
ry: 12 or higher
itou_noiji: 35 or higher

1