Cybercrime in the Deep Web Dr. Marco Balduzzi HackInBo, 14th May 2016

embyte:~$ whoami ◎ Underground and ‘hackish’ subculture since the early 2000s ◎ M.Sc. + Ph.D. in System Security ◎ Turned hobby into profession ◎ Sr. Research Scientist at Trend Micro ◎ Bridge scientific research and industry needs

◎ Veteran speaker at major conferences, with a wide presence on review boards

2

The Deep Web

3

Deep, dark, what? ◎ Deep Web: the Internet not indexed by traditional search engines (e.g., private forums) ◎ Dark Net: Private overlay network (e.g., TOR) ◎ Dark Web: WWW hosted on Dark Nets

4

TOR ◎ First alpha in 2002 ◎ Initially used to browse the Surface Web anonymously ◎ Hidden services -> effective Dark Web ◎ Onion routing: multihop routing with host key encryption
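The layering idea behind onion routing can be illustrated with a toy sketch. This is not TOR's actual cryptography: the cipher below is a deliberately simple SHA-256 keystream XOR, and the hop names and keys are made up for illustration. It only shows that the client adds one layer per hop and each relay removes exactly one.

```python
import hashlib


def toy_layer(key: bytes, data: bytes) -> bytes:
    """Apply (or remove) one layer: XOR with a SHA-256-derived keystream.

    Toy cipher for illustration only, NOT secure and NOT what TOR uses.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def build_onion(message: bytes, hop_keys: list[bytes]) -> bytes:
    """Client side: wrap the message once per hop, last hop first."""
    cell = message
    for key in reversed(hop_keys):
        cell = toy_layer(key, cell)
    return cell


def route(cell: bytes, hop_keys: list[bytes]) -> bytes:
    """Relay side: each hop on the path peels its own layer in turn."""
    for key in hop_keys:
        cell = toy_layer(key, cell)
    return cell


# Hypothetical three-hop circuit with made-up keys.
keys = [b"guard-key", b"middle-key", b"exit-key"]
onion = build_onion(b"GET / HTTP/1.1", keys)
plain = route(onion, keys)
```

Only after the last hop has removed its layer does the original request reappear; a relay in the middle of the path sees neither the plaintext nor the full route.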

5

I2P ◎ First beta in 2003 ◎ Full Dark Net, no anonymous browsing to the Surface Web ◎ Garlic routing: multiple encrypted tunnels, multiple layers of encryption (transport, tunnel, path)

6

Freenet ◎ Oldest one: summer of 1999 (father of I2P) ◎ Content distribution and discovery, no service hosting ◎ Gossip protocol to look up a resource (e.g., a web page)

7

Namecoins, Emercoins ◎ Blockchain-based domain name servers ◎ Think bitcoins, but with DNS registrar transactions instead of payment transactions

◎ Distributed ◎ Decentralised ◎ No regulating institution

8

RogueTLDs & PrivateDNSes Plain old DNS, but with custom servers

Custom registrars

Custom domains

9

DeWA (Deep Web Analyzer)

10

System Overview

11

Data Sources ◎ User transactions ◎ Pastebin-like sites ◎ Twitter 1% feed ◎ Reddit ◎ URL listing sites ◎ TOR gateways ◎ I2P host files ◎ Scouting feedback

12

Page Scouting ◎ Each page is loaded in a headless browser, which produces: ◎ HAR log ◎ Page DOM ◎ Screenshot ◎ Title ◎ Raw HTML ◎ Text ◎ Metadata ◎ Links ◎ Emails ◎ Bitcoin wallets
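One use of the HAR log is to rebuild the HTTP redirection chain of a scouted page. The sketch below is a minimal illustration and not DeWA's code: it assumes a HAR 1.2 style dictionary (`log.entries[].request.url`, `response.status`, `response.redirectURL`), and the function name and sample hosts are hypothetical.

```python
from urllib.parse import urljoin


def redirect_chain(har: dict, start_url: str) -> list[str]:
    """Follow 3xx responses recorded in a HAR log, starting at start_url."""
    # Index the first recorded response for each requested URL.
    responses = {}
    for entry in har["log"]["entries"]:
        responses.setdefault(entry["request"]["url"], entry["response"])

    chain = [start_url]
    url = start_url
    while url in responses:
        response = responses[url]
        target = response.get("redirectURL") or ""
        if not (300 <= response["status"] < 400) or not target:
            break  # final page reached
        url = urljoin(url, target)  # resolve relative Location values
        if url in chain:
            break  # redirect loop
        chain.append(url)
    return chain


# Made-up HAR fragment for illustration.
sample_har = {"log": {"entries": [
    {"request": {"url": "http://a.example/"},
     "response": {"status": 302, "redirectURL": "/login"}},
    {"request": {"url": "http://a.example/login"},
     "response": {"status": 301, "redirectURL": "http://b.example/"}},
    {"request": {"url": "http://b.example/"},
     "response": {"status": 200, "redirectURL": ""}},
]}}
chain = redirect_chain(sample_har, "http://a.example/")
```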

13

Headless Browser ◎ Scrapinghub's Splash ◎ QtWebKit browser ◎ Dockerized ◎ Lua scriptable ◎ Full HTTP traces

◎ Crawler based on Python's Scrapy + multiprocess + Splash access ◎ Header rewriting ◎ Shared queue support ◎ HAR log -> HTTP redirection chain

◎ Extract links, emails, bitcoin wallets

14
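A minimal sketch of the extraction step, assuming plain regular expressions over the raw HTML. The patterns are illustrative assumptions, not the crawler's actual rules: the wallet pattern only covers Base58 addresses starting with 1 or 3, and the email pattern requires a dotted domain.

```python
import re

# Illustrative patterns (assumptions, not DeWA's real rules).
LINK_RE = re.compile(r"""href=["']([^"']+)["']""", re.IGNORECASE)
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
WALLET_RE = re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")


def extract_artifacts(html: str) -> dict[str, list[str]]:
    """Return de-duplicated, sorted links, emails and wallet candidates."""
    return {
        "links": sorted(set(LINK_RE.findall(html))),
        "emails": sorted(set(EMAIL_RE.findall(html))),
        "wallets": sorted(set(WALLET_RE.findall(html))),
    }


# Made-up page snippet; the wallet is the well-known Bitcoin genesis address.
page = ('<a href="http://example.onion/shop">shop</a> '
        "contact: seller@example.org "
        "pay to 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
found = extract_artifacts(page)
```

Regex hits are only candidates; wallet strings in particular are worth validating before they are counted.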

Data Enrichment ◎ Links classification: Surface Web links; classification and categorisation ◎ Page translation: language detection; non-English to English ◎ Significant wordcloud: semantic clustering; custom algorithm
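As a first step towards links classification, extracted links can be split by the network they point to, so that Surface Web links can be handed to a separate categorisation stage. The sketch below assumes the split is made on the hostname suffix only; the labels and the function name are hypothetical.

```python
from urllib.parse import urlparse


def classify_link(url: str) -> str:
    """Label a link by network, based on its hostname suffix (assumption)."""
    host = (urlparse(url).hostname or "").lower()
    if host.endswith(".onion"):
        return "tor"
    if host.endswith(".i2p"):
        return "i2p"
    if host:
        return "surface"
    return "unknown"


labels = [classify_link(u) for u in (
    "http://wyzn2fvcztadictl.onion:80/viewtopic.php?pid=16452",
    "http://example.i2p/",
    "http://www.example.com/page",
)]
```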

15

Example: Russian Forum

16

Collected Data ◎ Running since Nov. 2013 ◎ 42.5 M events ◎ 624,000 URLs ◎ 35,500 domains

17

Illegal Trading

18

Drugs! Drugs! Drugs!

19

Guns

20

Passports and Fake IDs

21

Counterfeit Money

22

Credit Cards ◎ Higher balance = higher price

23

Stolen PayPal & eBay Accounts

24

Doxing

25

Assassins

26

Data Analysis

27

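Protocols (HTTP/S+) ◎ By publicly sourced URLs ◎ By active portscan

[Chart: counts for IRC, IRCS and SSH per discovery method. Values as extracted: 7, 17, 172 (publicly sourced URLs); 49, 31, 855 (active portscan).]

28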

* We are based on anarchistic control so nobody haz power certainly not power over the servers or
* domains who ever says that this or that person haz power here, are trolls and mostly agents of factions
* that haz butthurt about the concept or praxis where the CyberGuerrilla Anonymous Nexus stands for.

#freeanons 15 [+Cnt] This channel is created to support arrested Anons and act with solidarity in Anons. No MoneyFags, No Famefags, No PowerManiacs, No LeaderFags! Another Anons was arrested in France: http://www.ladepeche.fr/article/2015/10/10/2194982-enquete-de-la-dgsi-sur-du-piratage-informatique.html

29

Languages per domain

30

Languages per domain (2)

31

French forum: Weapon sale http://wyzn2fvcztadictl.onion:80/viewtopic.php?pid=16452

32

Pages Embedding Suspicious Links

33

Email Identification

34

bankofamerica@mail2tor

35

Exilio forum 1/2

http://ogatl57cbva6tncg.onion:80/index.php?t=msg&th=833&goto=4445&#msg_4445

36

Exilio forum 2/2

37

Automated Bitcoin Identification

1200+ bitcoin wallets found in our data (not counting the obfuscated ones)
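Automated identification needs a way to tell real wallets from look-alike strings. One common approach is to verify the Base58Check checksum of each candidate. The sketch below is an assumption about how such a check could look, not the method used for the figure above; it covers legacy Base58 addresses only and skips the strict leading-zero check.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def is_valid_base58check_address(address: str) -> bool:
    """Check the 4-byte double-SHA-256 checksum of a legacy Bitcoin address."""
    number = 0
    for char in address:
        index = BASE58_ALPHABET.find(char)
        if index < 0:
            return False  # character outside the Base58 alphabet
        number = number * 58 + index
    try:
        # 1 version byte + 20-byte hash + 4-byte checksum
        raw = number.to_bytes(25, "big")
    except OverflowError:
        return False
    payload, checksum = raw[:-4], raw[-4:]
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return digest[:4] == checksum


# The Bitcoin genesis-block address is used here as a known-good sample.
ok = is_valid_base58check_address("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
```

A single mistyped character changes the checksum, so random strings that merely match a wallet-shaped regex are rejected.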

38

Bitcoin Tumblers http://tumbly5lisxnjozd.onion:80/

39

Bitcoin Multiplier 1/2 http://tfsux6hiihj7qvxh.onion:80/

40

Bitcoin Multiplier 2/2

41

Malware

42

Malware: Its adoption in the Deep Web ◎ Modern malware is network-dependent ◎ @ infection-time: Exploit kits ◎ @ propagation-time: 2nd stage malware ◎ @ operational-time: C&C servers

◎ Goals: ◎ Make botnets resilient against LEA operations, e.g. takedowns ◎ Conceal payment pages ◎ Untraceable money transfers

◎ Additional reading: ◎ Brown at Defcon 18 ◎ Hunting Down Malware on the Deep Web (InfoSec Institute)

43

SkyNet ◎ Malware with DDoS, bitcoin mining and banking capabilities (©G-Data/Rapid7) ◎ ZeuS bot ◎ Bitcoin mining tool (CGMiner) ◎ GPU libraries for hash cracking

◎ TOR client for Windows ◎ Uses /gate.php as landing page to store the harvested credentials ◎ Path monitoring ….

44

SkyNet: Dynamic TOR-based C&Cs

45

Dyre Banking Trojan ◎ BHO that MiTMs online-banking pages at browser level ◎ Back-connects from victim to attacker (a kind of reverse-shell approach) ◎ DGA generation of C&C domains on the Clearnet ◎ Uses I2P as a backup option (:80/443) ◎ nhgyzrn2p2gejk57wveao5kxa7b3nhtc4saoonjpsy65mapycaua.b32.i2p (already known to SecureWorks on 17 December 2014) ◎ oguws7cr5xvl5jlrhyxjktcdi2d7k5cqeulu4mdl75xxfwmhgnsq.b32.i2p ◎ 4nhgyzrn2p2gejk57wveao5kxa7b3nhtc4saoonjpsy65mapycaua.b32.i2p
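Hostnames like these can be screened mechanically. The sketch below assumes the classic b32.i2p form, where the label is 52 Base32 characters encoding a 32-byte hash; longer extended formats are deliberately not handled, and the function name is hypothetical.

```python
import base64
import binascii


def is_classic_b32_i2p(hostname: str) -> bool:
    """True if hostname looks like a classic 52-character *.b32.i2p address."""
    suffix = ".b32.i2p"
    host = hostname.lower()
    if not host.endswith(suffix):
        return False
    label = host[: -len(suffix)]
    if len(label) != 52:
        return False
    try:
        # 52 Base32 characters need 4 padding characters to decode.
        raw = base64.b32decode(label.upper() + "====")
    except (binascii.Error, ValueError):
        return False
    return len(raw) == 32


check = is_classic_b32_i2p(
    "oguws7cr5xvl5jlrhyxjktcdi2d7k5cqeulu4mdl75xxfwmhgnsq.b32.i2p")
```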

46

Dyre’s Infection Evolution

47

Vawtrak Banking Trojan ◎ Spreads via phishing emails ◎ C&C servers (IPs) are retrieved by downloading the ‘favicon.ico’ icon file from websites hosted on the TOR network ◎ IPs are steganographically hidden

48

Vawtrak Banking Trojan (cont.) ◎ Runs ‘openresty/1.7.2.1’ as web server ◎ Return code on ‘favicon.ico’ is 403 Forbidden

◎ `ws=‘openresty/1.7.2.1’ && ∃(‘favicon.ico’) && retcode=403` returns a list of 23 matches.
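The query above can be read as a three-part filter over crawled responses. The sketch below is an illustration under assumptions: the record layout (`host`, `server`, `path`, `status`) and the sample hosts are invented, and only the three conditions come from the slide.

```python
def matches_fingerprint(record: dict) -> bool:
    """Server banner, favicon request and 403 status must all match."""
    return (
        record.get("server") == "openresty/1.7.2.1"
        and record.get("path", "").endswith("/favicon.ico")
        and record.get("status") == 403
    )


def suspicious_hosts(records: list[dict]) -> list[str]:
    """Return the sorted set of hosts whose responses match the fingerprint."""
    return sorted({r["host"] for r in records if matches_fingerprint(r)})


# Made-up crawl records with a hypothetical schema.
records = [
    {"host": "aaaa.onion", "server": "openresty/1.7.2.1",
     "path": "/favicon.ico", "status": 403},
    {"host": "bbbb.onion", "server": "nginx",
     "path": "/favicon.ico", "status": 403},
    {"host": "cccc.onion", "server": "openresty/1.7.2.1",
     "path": "/favicon.ico", "status": 200},
]
hits = suspicious_hosts(records)
```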

49

Vawtrak Banking Trojan (cont.)

50

Ransomware ◎ Ransomware seems to love the Deep Web ◎ It provides a hidden and robust “framework” for cashouts and illicit money transfers

51

TorrentLocker ◎ A variant of CryptoLocker ◎ Payment page hosted in the Deep Web ◎ Cashout via Bitcoins

52

TorrentLocker (cont.) ◎ Malware generates unique IDs ◎ wzaxcyqroduouk5n.onion/axdf84v.php/user_code=qz1n2i&user_pass=9019 ◎ wzaxcyqroduouk5n.onion/o2xd3x.php/user_code=8llak0&user_pass=6775

◎ Tracking on specific query-string parameters ◎ path=’/[a-z0-9]{6}\.php/user_code=[a-z0-9]{6}&user_pass=[0-9]{4}’
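The tracking pattern translates directly into a regular expression. The sketch below is a minimal illustration: capture groups are added for the two identifiers, the dot before `php` is escaped, and the helper name is hypothetical. It is exercised only on the second sample URL.

```python
import re

# Pattern from the slide, with capture groups for the victim identifiers.
PAYMENT_PATH_RE = re.compile(
    r"/[a-z0-9]{6}\.php/user_code=([a-z0-9]{6})&user_pass=([0-9]{4})"
)


def parse_payment_url(url: str):
    """Return (user_code, user_pass) if the URL matches, else None."""
    match = PAYMENT_PATH_RE.search(url)
    return match.groups() if match else None


victim = parse_payment_url(
    "wzaxcyqroduouk5n.onion/o2xd3x.php/user_code=8llak0&user_pass=6775")
```

Collecting the distinct `user_code` values over time gives one identifier per victim, which is what makes a breakdown by victim possible.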

53

Breakdown by victims and country

54

NionSpy ◎ Steals confidential information such as keystrokes, passwords and private documents ◎ Records video and audio, suitable for espionage programs ◎ Detection feature: ◎ Popularity in the number of values associated with parameters (in the query string)
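The detection feature can be sketched as a count of distinct values seen per query-string parameter, flagging parameters whose count crosses a threshold. The threshold, function names and sample URLs below are assumptions for illustration only.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def distinct_values_per_parameter(urls: list[str]) -> dict[str, int]:
    """Count how many different values each query-string parameter takes."""
    values = defaultdict(set)
    for url in urls:
        query = urlsplit(url).query
        for name, value in parse_qsl(query, keep_blank_values=True):
            values[name].add(value)
    return {name: len(seen) for name, seen in values.items()}


def popular_parameters(urls: list[str], threshold: int) -> list[str]:
    """Parameters with at least `threshold` distinct values (hypothetical cut-off)."""
    counts = distinct_values_per_parameter(urls)
    return sorted(name for name, count in counts.items() if count >= threshold)


# Made-up URLs: 'xu' takes many values, 'page' only one.
sample_urls = [f"http://example.onion/si.php?xu=blob{i}" for i in range(5)]
sample_urls += ["http://example.onion/index.php?page=1"] * 3
flagged = popular_parameters(sample_urls, threshold=5)
```

Tracking these counts per day rather than in total would expose sudden surges in a parameter's popularity.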

55

Automated Detection

56

NionSpy: GET’s query string analysis ◎ xu experienced a quick surge in popularity: 1700+ values ◎ si.php?xu=%e0%ee%a8%e5%f2%e9%e5%e4%f2[...] ◎ URL-encoded binary blob representing the leaked data

◎ si.php?xd={"f155":"MACHINE_IP","f4336":"MACHINE_NAME","f7035":"5.9.1.1","f1121":"windows","f2015":"1"} ◎ Reports a new infection

57

NionSpy: New victims and leakages ◎ Blue (xd): # of new victims / day ◎ Green (xu): amount of leaked information (bytes)

58

Thank you! Dr. Marco Balduzzi @embyte

59
