What is CURL?


[Copyright: under an MIT/X derivative license]


curl is a tool to transfer data from or to a server, using one of the supported protocols.



curl is a command-line tool for transferring data with URL syntax.

Supported protocols:

DICT, FILE, FTP, FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS, Telnet, ...

Tools To Use Instead of curl:

wget, ncat, ...


Tools That Help You Use curl:

LiveHTTPHeaders, Wireshark, tcpdump, ssldump, ...


Companies Using curl in Commercial Environments:

Adobe, Google, CERN, Cisco, F-Secure, IBM, ...


Programs using curl for file transfers:

mcurl, leech, ...

What is LIBCURL?
[Copyright: under an MIT/X derivative license]



libcurl is a free and easy-to-use client-side URL transfer library.


libcurl is highly portable: it builds and works identically on numerous platforms.


libcurl is free, thread-safe, IPv6 compatible, feature rich, well supported, fast, and thoroughly documented.

42 Bindings

Basic, C++, D, Falcon, Java, Lua, PHP, Haskell, Perl, Python, Ruby, SPL, Postgres, ...

wget website grabber (recursive):


			$ wget \
				 --recursive \
				 --no-clobber \
				 --page-requisites \
				 --html-extension \
				 --convert-links \
				 --restrict-file-names=windows \
				 --domains backtrack-linux.org \
				 --no-parent \
					 http://www.backtrack-linux.org/wiki/index.php/Main_Page

		
Introducing PHP/CURL

Multiple Transfer Protocols
Form Submission
Basic Authentication
Cookies
Redirection
Agent Name Spoofing
Referer Management
Socket Management
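
Most of the features listed above map onto a handful of `CURLOPT_*` options. The sketch below is a minimal illustration, assuming PHP's curl extension is loaded. The function name `make_webbot_handle`, the form fields, the credentials, the agent string, the referer, and the cookie file path are all placeholders invented for this example. The handle is only configured here, not executed.

```php
<?php
// Hypothetical helper: builds (but does not execute) a configured cURL handle.
function make_webbot_handle($url)
{
    $cookieFile = sys_get_temp_dir() . '/webbot_cookies.txt'; // placeholder path

    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        // Form submission: send URL-encoded fields with POST (placeholder fields)
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query(['q' => 'curl', 'page' => 1]),
        // Basic authentication (placeholder credentials)
        CURLOPT_HTTPAUTH       => CURLAUTH_BASIC,
        CURLOPT_USERPWD        => 'user:secret',
        // Cookies: read from and write back to the same file
        CURLOPT_COOKIEFILE     => $cookieFile,
        CURLOPT_COOKIEJAR      => $cookieFile,
        // Redirection: follow Location headers, up to a limit
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 5,
        // Agent name and referer (placeholder values)
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; ExampleWebbot/1.0)',
        CURLOPT_REFERER        => 'http://www.example.com/',
        // Timeouts, one aspect of socket management
        CURLOPT_CONNECTTIMEOUT => 10,
        CURLOPT_TIMEOUT        => 30,
    ]);

    return $ch;
}
```

Passing the returned handle to `curl_exec()` would perform the transfer with all of these settings applied.
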

The Problem with Browsers

  • Browsers cannot aggregate and filter information for relevance.
  • Browsers cannot interpret what they find online.
  • Browsers cannot act on your behalf.

Webbot samples

  • Webbots That Aggregate and Filter Information for Relevance
  • Webbots That Interpret What They Find Online
  • Webbots That Act on Your Behalf (Pokerbots)

Challenges

  • Making Data Smaller
  • Form Submission
  • Long-Lived Processes
  • Memory Management
  • Fault Tolerance
  • Spider Traps
  • System Log
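
Fault tolerance and logging from the list above can be combined in a small retry wrapper. This is a minimal sketch, not a prescribed design: `fetch_with_retry` and its parameters are hypothetical names, and the doubling delay is one common backoff choice among several.

```php
<?php
// Hypothetical helper: calls $fetch until it succeeds or the attempts run out.
// $fetch is any callable that returns a result or throws an Exception on failure.
function fetch_with_retry(callable $fetch, $maxAttempts = 3, $baseDelayMs = 500)
{
    $lastError = null;
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        try {
            return $fetch();
        } catch (Exception $e) {
            $lastError = $e;
            // Keep a log entry for every failure
            error_log("Attempt $attempt failed: " . $e->getMessage());
            if ($attempt < $maxAttempts) {
                // Exponential backoff: base, 2x base, 4x base, ...
                usleep($baseDelayMs * 1000 * (2 ** ($attempt - 1)));
            }
        }
    }
    throw new RuntimeException("Giving up after $maxAttempts attempts", 0, $lastError);
}
```

A long-lived webbot could wrap each download in this function so that one failed request does not end the whole run.
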

curl_getinfo

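			curl_getinfo($ch)['http_code']     // HTTP status code of the last response
			curl_getinfo($ch)['content_type']  // Content-Type of the response, or null if none was sent
			curl_getinfo($ch)['request_size']  // total size in bytes of the requests issued
			curl_getinfo($ch)['certinfo']      // TLS certificate chain (filled only when CURLOPT_CERTINFO is set)
			curl_getinfo($ch)['redirect_url']  // URL to request next when redirects are not followed automatically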
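
These values are only meaningful after a transfer has run on the handle. The following is a minimal sketch of the full sequence, assuming the curl extension is available. `fetch_with_info` is a hypothetical helper name, not a PHP/CURL function.

```php
<?php
// Hypothetical helper: performs one transfer and returns the body
// together with the curl_getinfo() fields shown on this slide.
function fetch_with_info($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // capture the body as a string
    curl_setopt($ch, CURLOPT_CERTINFO, true);       // collect certificate details on HTTPS transfers

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Transfer failed: ' . curl_error($ch));
    }

    $info = curl_getinfo($ch); // associative array describing the last transfer

    // Keep only the fields of interest, plus the body
    $wanted = ['http_code', 'content_type', 'request_size', 'certinfo', 'redirect_url'];
    return ['body' => $body] + array_intersect_key($info, array_flip($wanted));
}
```

A caller might check `$result['http_code']` before parsing `$result['body']`, for example to skip pages that did not return 200.
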
			

Sniper:

Procurement Bot

Scaling:

  • One-to-many
  • One-to-one
  • Many-to-many
  • Many-to-one

Multiple Instances of a Webbot:

  • Additional processes
  • Fork current process
  • Multiple pieces of hardware
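
Forking the current process can be sketched as follows. This assumes PHP running from the command line on a Unix-like system with the pcntl extension. `partition_urls` and `run_in_children` are hypothetical names, and splitting the work by URL slices is only one possible division.

```php
<?php
// Hypothetical helper: splits a URL list into at most $workers slices of similar size.
function partition_urls(array $urls, $workers)
{
    $size = max(1, (int) ceil(count($urls) / max(1, $workers)));
    return array_chunk($urls, $size);
}

// Hypothetical helper: forks one child per slice and waits for all of them.
// Requires the pcntl extension.
function run_in_children(array $urls, $workers, callable $work)
{
    $pids = [];
    foreach (partition_urls($urls, $workers) as $slice) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('Could not fork');
        }
        if ($pid === 0) {
            $work($slice);   // child: process its own slice
            exit(0);         // then exit so it never runs the parent's code
        }
        $pids[] = $pid;      // parent: remember the child
    }
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status); // block until each child finishes
    }
    return count($pids);
}
```

Each child works on its own copy of the data, so results have to be written somewhere shared, such as a file or a database, if the parent needs them.
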

Stealth:

Simulating Human Patterns:
  • Be Kind to Your Resources
  • Run During Busy Hours
  • Use Random Delays
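
Random delays are simple to add between requests. This is a minimal sketch, assuming PHP 7 or later for `random_int()`. The helper names and the 5 to 30 second range are illustrative choices, not recommendations from the slides.

```php
<?php
// Hypothetical helper: picks a pause length in seconds between $min and $max inclusive.
function random_delay_seconds($min = 5, $max = 30)
{
    return random_int($min, $max);
}

// Hypothetical helper: fetches each URL with $fetch, pausing a random time between requests.
function crawl_politely(array $urls, callable $fetch, $min = 5, $max = 30)
{
    $results = [];
    $urls = array_values($urls);
    $last = count($urls) - 1;
    foreach ($urls as $i => $url) {
        $results[$url] = $fetch($url);
        if ($i < $last) {
            // Irregular gaps look less machine-like and spare the server
            sleep(random_delay_seconds($min, $max));
        }
    }
    return $results;
}
```

`$fetch` can be any callable that takes a URL, for example a function wrapping `curl_exec()`.
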

Killing Webbots:

  • Use the robots.txt File
  • Use the Robots Meta Tag
  • Use Obfuscation
  • Use Cookies, Encryption, JavaScript, and Redirection
  • Embed Text in Other Media
  • Spider Trap
Respect robots.txt and the robots meta tag: yes or no?
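
For a webbot that chooses to respect robots.txt, the check can be sketched as below. This is a deliberately simplified illustration: it honours only `User-agent` and `Disallow` lines with plain prefix matching and exact agent names, and ignores `Allow`, wildcards, and other extensions found in real files. `is_path_allowed` is a hypothetical helper name.

```php
<?php
// Simplified sketch: returns false when $path falls under a Disallow prefix
// that applies to $agent, true otherwise.
function is_path_allowed($robotsTxt, $agent, $path)
{
    $rules = [];        // lowercase agent name => list of disallowed prefixes
    $groupAgents = [];  // agents named by the group currently being read
    $inRules = false;   // true once the current group has started listing rules

    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        $line = trim(preg_replace('/#.*$/', '', $line)); // strip comments
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        list($field, $value) = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);

        if ($field === 'user-agent') {
            if ($inRules) {          // a User-agent line after rules starts a new group
                $groupAgents = [];
                $inRules = false;
            }
            $groupAgents[] = strtolower($value);
        } else {
            $inRules = true;
            if ($field === 'disallow') {
                foreach ($groupAgents as $name) {
                    $rules[$name][] = $value;
                }
            }
        }
    }

    // Prefer a group naming this agent; otherwise fall back to the "*" group.
    $agent = strtolower($agent);
    if (isset($rules[$agent])) {
        $prefixes = $rules[$agent];
    } elseif (isset($rules['*'])) {
        $prefixes = $rules['*'];
    } else {
        $prefixes = [];
    }

    foreach ($prefixes as $prefix) {
        // An empty Disallow value means "nothing is disallowed"
        if ($prefix !== '' && strpos($path, $prefix) === 0) {
            return false;
        }
    }
    return true;
}
```

A webbot would download `/robots.txt` once per site, then call this function before requesting each path.
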
Botnet
