Introduction to curl
curl is free and open source software for transferring data to and from a server.
It is a simple and robust command line tool.
It supports many protocols including HTTP. It is quite easy to use.
For example, you can get an HTML page just by typing the URL of the web page after the curl command.
curl example.com
You will see the page source of the web page. Note that curl does not parse or render HTML as a browser does.
If you give no scheme, curl assumes HTTP. To reach an HTTPS website, you have to specify the complete URL, including the https:// scheme.
curl https://www.facebook.com
You can send a GET request with curl.
curl http://www.example.com/login?name=ryan
You can send a POST request with curl.
curl --data "name=ryan" http://www.example.com/login
I assume you now have a basic knowledge of curl.
HTTP Redirects
HTTP, the HyperText Transfer Protocol, is an application-level protocol for data communication on the World Wide Web.
It is the most used protocol on the internet for transferring data. You saw above how you can get a web page over HTTP with the help of curl.
HTTP has many features, and redirects are one of its fundamental concepts. HTTP represents redirects with 3xx status codes.
There are several types of redirects, including 301 and 302.
Redirects work exactly as the name suggests.
Sometimes, when you send a request to a server asking for a web page, you get an instruction from the server instead of the requested page.
This instruction tells you where to look for the requested web page.
Let’s check this example.
curl google.com
When you curl google.com you will get the following response. Note that, because no scheme was given, you are calling http://google.com here.
Output
<html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>301 Moved</title> </head> <body> <h1>301 Moved</h1> The document has moved <a href="http://www.google.com/">here</a> </body> </html>
The server tells you that the document has moved to a new location and sends you the address of that location. The status code for this redirect is 301, which means the move is permanent. In contrast, 302 means the move is temporary.
So, to get to google.com, you have to make another curl request to the specified address.
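To make the mechanics concrete, here is a minimal Python sketch of what a client has to do with such a response: read the status code and, for a 3xx status, pick up the new address from the Location header. The helper name `redirect_target` and the sample response text are made up for illustration; this is not how curl itself is implemented.

```python
def redirect_target(raw_headers):
    """Return the Location value if the response is a 3xx redirect, else None."""
    lines = raw_headers.strip().splitlines()
    # The status line looks like: HTTP/1.1 301 Moved Permanently
    status = int(lines[0].split()[1])
    if not 300 <= status < 400:
        return None
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "location":
            return value.strip()
    return None


# Hypothetical response headers, similar to what `curl -i google.com` might print.
sample = """HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
"""

print(redirect_target(sample))
```

The returned address is the one you would pass to the next curl call.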
Why don’t you see such behavior from a browser?
Because browsers follow HTTP redirects for you automatically.
curl does not follow redirects by default, but with an extra argument you can instruct it to do so.
The option has a short and a long form: you can use either -L or --location.
curl -L google.com
Now you will get the huge page source of google.com, which I don’t want to put here.
Temporary and Permanent Redirects
Not all redirects are the same, and there are several things to consider. We already saw that some redirects are permanent while others are temporary.
This means that for a permanent redirect, HTTP expects the user agent (browser/curl) to cache the redirected URI and go directly to it from then on.
For a temporary redirect, the user agent should not cache the redirected URI and should keep trying the original URI on every subsequent request.
Browsers have this capability but curl does not. Hence, for curl there is no difference between temporary and permanent redirects.
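The caching behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming a hypothetical `RedirectCache` class and treating 301 and 308 as the permanent status codes; real browsers apply more rules, such as cache lifetimes.

```python
PERMANENT = {301, 308}  # status codes treated as permanent moves in this sketch


class RedirectCache:
    """Remembers permanent redirects so later requests can skip the original URI."""

    def __init__(self):
        self._moved = {}

    def record(self, url, status, location):
        # Only permanent redirects are remembered; temporary ones are ignored.
        if status in PERMANENT:
            self._moved[url] = location

    def resolve(self, url):
        # Return the remembered target, or the URL itself if nothing is cached.
        return self._moved.get(url, url)


cache = RedirectCache()
cache.record("http://example.com/old", 301, "http://example.com/new")
cache.record("http://example.com/tmp", 302, "http://example.com/elsewhere")

print(cache.resolve("http://example.com/old"))  # goes straight to the new URI
print(cache.resolve("http://example.com/tmp"))  # still asks the original URI
```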
Number of Redirects
Sometimes redirects are chained: the first address redirects to a second address, which redirects to yet another.
When following redirects is enabled, curl allows up to 50 redirects by default. The limit exists to avoid getting caught in endless loops.
Let’s say A redirects to B and B redirects to A.
This will cause an endless loop.
Such a situation can arise either by mistake or through someone’s malicious intent. If you want to change the number of redirects to be followed, you can do so with the --max-redirs option.
curl -L --max-redirs 10 example.com
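The idea behind a redirect limit can be shown with a small Python sketch. It does no networking: the redirects are a plain dictionary, and the names `follow` and `TooManyRedirects` are invented for this example. It only mirrors the behavior described above and is not curl's actual code.

```python
class TooManyRedirects(Exception):
    pass


def follow(start, redirects, max_redirs=50):
    """Follow `redirects` (a dict of url -> next url) for at most max_redirs hops."""
    url = start
    hops = 0
    while url in redirects:
        if hops >= max_redirs:
            raise TooManyRedirects(f"gave up after {max_redirs} redirects at {url}")
        url = redirects[url]
        hops += 1
    return url, hops


# A normal chain ends at C after two hops.
chain = {"A": "B", "B": "C"}
print(follow("A", chain))

# A loop never ends on its own, so the limit has to stop it.
loop = {"A": "B", "B": "A"}
try:
    follow("A", loop, max_redirs=10)
except TooManyRedirects as err:
    print(err)
```

Without the `max_redirs` check, the second call would never return.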
Redirection methods: GET/POST
You can follow redirects with a GET request like below.
curl -L http://www.example.com/login?name=ryan
You can follow redirects with a POST request like below.
curl -L --data "name=ryan" http://www.example.com/login
There are a few things to know about HTTP redirect methods.
For 301 and 302 redirects, curl sends the follow-up request as a GET. That means that even if the original request was a POST, the redirected request will be a GET.
On the other hand, 307 and 308 keep the original method for the redirected request.
curl follows these conventions without any problem. But there are a lot of web services on the internet that use 301 and 302 redirects, yet want both the original and the redirected request to be a POST.
You can do that with curl using the --post301 and --post302 options.
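The method rules above fit into one small function. The Python sketch below is a simplified model of that decision; `method_after_redirect` and its `keep_post` parameter are hypothetical names, with `keep_post` standing in for the effect of --post301 and --post302.

```python
def method_after_redirect(status, method, keep_post=frozenset()):
    """Return the HTTP method to use for the request that follows a redirect."""
    # 301 and 302 turn a POST into a GET unless the caller asked to keep it.
    if method == "POST" and status in (301, 302) and status not in keep_post:
        return "GET"
    # 307 and 308 (and every other case in this sketch) keep the original method.
    return method


print(method_after_redirect(301, "POST"))                   # GET
print(method_after_redirect(307, "POST"))                   # POST
print(method_after_redirect(302, "POST", keep_post={302}))  # POST
```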
Redirect to a different host
Redirection can occur within the same server or between different servers. For example, a website can be moved to a new hosting provider.
curl behaves differently when it is redirected to another host: it limits what data it sends, such as credentials.
So, if you want to send credentials like usernames and passwords and you fully trust the redirected server, you should tell curl so with the --location-trusted option.
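As a rough illustration of this rule, the Python sketch below drops the Authorization header when a redirect leaves the original host, unless the caller marks the target as trusted. The function name, the `trusted` flag, and the plain host name comparison are assumptions for this example; curl's own checks are more detailed.

```python
from urllib.parse import urlsplit


def headers_for_redirect(original_url, target_url, headers, trusted=False):
    """Return the headers to send to target_url after a redirect from original_url."""
    same_host = urlsplit(original_url).hostname == urlsplit(target_url).hostname
    if same_host or trusted:
        return dict(headers)
    # Different host and not trusted: leave the credentials out.
    return {k: v for k, v in headers.items() if k.lower() != "authorization"}


headers = {"Authorization": "Basic cnlhbjpzZWNyZXQ=", "Accept": "*/*"}

# Same host: credentials are kept.
print(headers_for_redirect("http://example.com/a", "http://example.com/b", headers))
# Other host: credentials are dropped unless trusted=True.
print(headers_for_redirect("http://example.com/a", "http://other.example.org/", headers))
```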
Conclusion
curl provides many options to interact with servers for transferring data, and you can use many of them together with the redirect option. Also note that curl only follows redirects that the server sends as HTTP responses. Client-side redirects, such as HTML redirects, are very difficult to follow with curl because curl never parses the HTML source. Let me know if you have any questions about following redirects with curl.