1. Pengui+(OP)[view] [source] 2023-08-15 17:28:28
Additional details I dug up for this rabbit hole. I don't think t.co is doing this intentionally; it looks more like poor handling of 'do you have our cookies or not'. Everyone in this thread is _proving things_ without taking into account the complexity of the modern web.

   man curl
       -b, --cookie <data|filename>
              (HTTP) Pass the data to the HTTP server in the Cookie header. It is supposedly the data previously received from the server in a "Set-Cookie:" line.
----

Add that option to your curl tests.

    ---
    $ time curl -s -b -A "curl/8.2.1" -e ";auto" -L https://t.co/4fs609qwWt -o /dev/null | sha256sum 
    eb9996199e81c3b966fa3d2e98e126516dfdd31f214410317f5bdcc3b241b6a2  -

    real    0m1.245s
    user    0m0.087s
    sys     0m0.034s
    ---

    $ time curl -s -b -e ";auto" -L https://t.co/4fs609qwWt -o /dev/null | sha256sum 
    eb9996199e81c3b966fa3d2e98e126516dfdd31f214410317f5bdcc3b241b6a2  -

    real    0m1.265s
    user    0m0.103s
    sys     0m0.023s
    ---

    $ time curl -s -b -A "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0" -e ";auto" -L https://t.co/4fs609qwWt -o /dev/null | sha256sum 
    eb9996199e81c3b966fa3d2e98e126516dfdd31f214410317f5bdcc3b241b6a2  -

    real    0m1.254s
    user    0m0.100s
    sys     0m0.018s
    ---
replies(5): >>ChrisA+Ob >>ender7+8h >>dymk+Rl >>scient+0t >>mzs+mC
2. ChrisA+Ob[view] [source] 2023-08-15 18:28:23
>>Pengui+(OP)
Amen
replies(1): >>ChrisA+tC
3. ender7+8h[view] [source] 2023-08-15 18:59:56
>>Pengui+(OP)
I can replicate this behavior fairly easily in a browser.

  1. Open incognito window in Chrome
  2. Visit https://t.co/4fs609qwWt -> 5s delay
  3. Open a second tab in the same window -> no delay
  4. Close window, start a new incognito session
  5. Visit https://t.co/4fs609qwWt -> 5s delay returns
replies(2): >>xslowz+Bk >>Pengui+Ln
◧◩
4. xslowz+Bk[view] [source] [discussion] 2023-08-15 19:16:14
>>ender7+8h
The reason there isn't a delay on the second click is that the redirect is cached locally in your browser.

Your humble anonymous tipster would appreciate if you do a little legwork.

replies(1): >>hk__2+H33
5. dymk+Rl[view] [source] 2023-08-15 19:20:32
>>Pengui+(OP)
If it's not intentional, why are people observing no delay for other domains, but a delay for NYT, bsky, etc.?
◧◩
6. Pengui+Ln[view] [source] [discussion] 2023-08-15 19:30:47
>>ender7+8h
What is that attempting to prove or replicate?

Here's a simpler test that I think replicates what I described in the GP comment with regard to cookie handling:

Not passing a cookie to the next stage; pure GET request:

    $ time curl -s -A "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0" -e ";auto" -L https://t.co/4fs609qwWt > nocookie.html

    real    0m4.916s
    user    0m0.016s
    sys     0m0.018s

Using `-b` to pass the cookies _(same command as above, just adding `-b`)_

    $ time curl -s -b -A "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0" -e ";auto" -L https://t.co/4fs609qwWt > withcookie.html

    real    0m1.995s
    user    0m0.083s
    sys     0m0.026s
Look at the differences in the resulting files for 'with' and 'no' cookie. One redirect completes in a timely manner. The other takes ~4-5 seconds to redirect.
replies(2): >>mzs+WC >>lapcat+oJ
7. scient+0t[view] [source] 2023-08-15 19:56:57
>>Pengui+(OP)
Amazing that this poor handling of 'do you have our cookies or not' only affects newspapers and social media sites that Elon doesn't like! What a coincidence.
8. mzs+mC[view] [source] 2023-08-15 20:52:38
>>Pengui+(OP)
oh boy... -b takes an argument, which in your examples ends up being -A or -e; what follows is then interpreted as a URL, and you throw away the warnings:

  % curl -vgsSIw'> %{time_total}\n' -b -A "curl/8.2.1" https://t.co/DzIiCFp7Ti 2>&1 | grep '^\(* WARNING: \)\|\(Could not resolve host: \)\|>' 
  * WARNING: failed to open cookie file "-A"
  * Could not resolve host: curl
  curl: (6) Could not resolve host: curl
  * WARNING: failed to open cookie file "-A"
  > HEAD /DzIiCFp7Ti HTTP/2
  > Host: t.co
  > User-Agent: curl/8.1.2
  > Accept: */*
  > 
  > 0.013309
  > 0.112494
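
To make the argument-swallowing concrete without touching the network, here is a minimal sketch using the shell's `getopts`. This is a toy parser, not curl's real one: the `parse` function and its option string are assumptions for illustration. Unlike curl, `getopts` stops at the first non-option argument, but the part that matters is the same: an option that takes a value consumes whatever comes next, even if it looks like another flag.

```shell
#!/bin/sh
# Toy option parser (NOT curl's actual parser): -b, -A, -e and -o each
# take one argument; -s and -L are plain switches.
parse() {
    OPTIND=1
    cookie='' agent=''
    while getopts 'sLb:A:e:o:' opt; do
        case "$opt" in
            b) cookie=$OPTARG ;;   # whatever follows -b becomes the cookie value
            A) agent=$OPTARG ;;
            *) ;;
        esac
    done
    shift $((OPTIND - 1))
    # The first leftover argument is what a URL-taking tool would treat as a URL.
    printf 'cookie=[%s] agent=[%s] first_positional=[%s]\n' "$cookie" "$agent" "${1-}"
}

# -b with no value: it swallows -A, and the UA string becomes a positional argument.
parse -s -b -A 'curl/8.2.1' -L https://t.co/4fs609qwWt

# -b with an explicit file: -A gets its value and the URL is the positional argument.
parse -s -b cookies.txt -A 'curl/8.2.1' -L https://t.co/4fs609qwWt
```

In the first call the cookie value is `-A` and the user agent is never set, which matches the `failed to open cookie file "-A"` warning above.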
replies(1): >>Pengui+EM
◧◩
9. ChrisA+tC[view] [source] [discussion] 2023-08-15 20:53:30
>>ChrisA+Ob
Good work Penguin. I believe in you
◧◩◪
10. mzs+WC[view] [source] [discussion] 2023-08-15 20:56:08
>>Pengui+Ln
In your second example you are passing the cookie file named ./-A then trying to GET the URL "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0" followed by https://t.co/4fs609qwWt
◧◩◪
11. lapcat+oJ[view] [source] [discussion] 2023-08-15 21:37:28
>>Pengui+Ln
You're completely missing the point, which is that the 5 second delay doesn't exist at all for most t.co links, even without cookies. The delay only exists for a few Musk-hated domains.
◧◩
12. Pengui+EM[view] [source] [discussion] 2023-08-15 21:56:33
>>mzs+mC
Alright, thanks for explaining that. Here's what I see when explicitly setting the cookie jar:

    $ time curl -s -b cookies.txt -c cookies.txt -A "Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0" -e ";auto" -L https://t.co/DzIiCFp7Ti

    [t.co meta refresh page src]

    real     0m4.635s
    user   0m0.004s
    sys     0m0.008s

    $ time curl -b cookies.txt -c cookies.txt -A "wget/1.23" -e ";auto" -L https://t.co/DzIiCFp7Ti
    curl: (7) Failed to connect to www.threads.net port 443:  Connection refused
    real     0m4.635s
    user   0m0.011s
    sys     0m0.005s

    $ time curl -b cookies.txt -c cookies.txt -e ";auto" -L https://t.co/DzIiCFp7Ti
    curl: (7) Failed to connect to www.threads.net port 443 Connection refused
    real     0m0.129s
    user   0m0.000s
    sys     0m0.013s
The failed connections are likely threads.net blocking those user agents, but the timing is still there, and it differs from the first UA attempt.
◧◩◪
13. hk__2+H33[view] [source] [discussion] 2023-08-16 15:25:10
>>xslowz+Bk
> The reason there isn't a delay the second click is because the redirect is cached locally in your browser.

No, because it’s not an HTTP redirect. It’s an HTML page that redirects you using a meta tag, something that the browser doesn’t cache.
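
A rough sketch of the distinction, for anyone testing with curl: `-L` follows HTTP `Location` redirects, but it does not act on a meta refresh, so the target has to be pulled out of the HTML by hand. The markup below is a hypothetical example of such a page, not the actual t.co response, and the `sed` pattern assumes that exact attribute order and quoting.

```shell
#!/bin/sh
# Hypothetical meta-refresh page (illustrative only, not real t.co output).
html='<html><head><meta http-equiv="refresh" content="0;URL=https://example.com/article"></head></html>'

# Extract the redirect target from the content attribute.
# Assumes the form content="<seconds>;URL=<target>" with double quotes.
target=$(printf '%s\n' "$html" |
    sed -n 's/.*http-equiv="refresh" content="[0-9]*;URL=\([^"]*\)".*/\1/p')

printf 'target=%s\n' "$target"
```

A real page may order or quote the attributes differently, so treat the pattern as a starting point rather than a general parser.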

replies(1): >>xslowz+kz5
◧◩◪◨
14. xslowz+kz5[view] [source] [discussion] 2023-08-17 07:02:47
>>hk__2+H33
Your humble anonymous tipster notes to their skeptical audience that browsers are capable of caching all sorts of things, even something as peculiar as an HTML page.
replies(1): >>hk__2+uN5
◧◩◪◨⬒
15. hk__2+uN5[view] [source] [discussion] 2023-08-17 09:07:19
>>xslowz+kz5
> browsers are capable of caching all sorts of things, even something as peculiar as an HTML page.

Yes, and this is irrelevant to your previous comment: caching the HTML doesn’t cache the redirect itself.

replies(1): >>xslowz+Bl7
◧◩◪◨⬒⬓
16. xslowz+Bl7[view] [source] [discussion] 2023-08-17 17:17:10
>>hk__2+uN5
You can lead a horse to water, but you can't make him drink. The delay was not on the HTML page.
replies(1): >>hk__2+Ui9
◧◩◪◨⬒⬓⬔
17. hk__2+Ui9[view] [source] [discussion] 2023-08-18 07:55:23
>>xslowz+Bl7
> The delay was not on the HTML page.

Nobody is saying that.
