Fundamental Principles of Proxy Server Operation


This article explains what proxies can and cannot do for your anonymity. You will learn the difference between HTTP, SOCKS, and transparent proxies, how to choose the right type for your task, and why merely changing your IP address is nowhere near sufficient for true anonymity.

Introduction: The Principle of Intermediation in Computer Networks

To understand what a proxy server is, one must first consider the standard model of interaction on the Internet without its participation. This model is known as the "direct connection client-server architecture."

  1. Standard Model (without a proxy):
  • Your device (client), be it a computer, smartphone, or tablet, has a unique digital identifier — an IP address (Internet Protocol Address). This is your "network passport" that you present every time you go online.

  • When you type a website address (e.g., google.com) into your browser, your computer establishes a direct TCP connection with the server where that site is located.

  • Your browser sends an HTTP request to the target server. Important point: the IP address itself is not part of the HTTP headers. It belongs to a lower network layer (the IP packet). A standard HTTP request carries headers such as Host and User-Agent. When the request passes through a proxy, the proxy often adds a service header, X-Forwarded-For[1], whose purpose is to convey the client's original IP address. With HTTPS, the request content is encrypted, but the IP addresses of both endpoints remain visible at the network level.

🔍 [1] X-Forwarded-For: This is a de facto standard header used by proxy servers and load balancers to convey the client's IP address. The first (leftmost) IP address in the list is usually the original client's address, although a client can forge the header, so servers should not trust it blindly.

  • The server receives the request, processes it, and returns data back to your IP address, which the browser displays as a web page.
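To make the X-Forwarded-For footnote concrete, here is a minimal Python sketch of how a server might read the header. The function names are illustrative, not taken from any framework, and the result is only the address the header claims, since the header can be forged.

```python
import ipaddress
from typing import List, Optional


def parse_x_forwarded_for(header_value: str) -> List[str]:
    """Return the valid IP addresses from an X-Forwarded-For value, in order."""
    addresses = []
    for part in header_value.split(","):
        candidate = part.strip()
        try:
            addresses.append(str(ipaddress.ip_address(candidate)))
        except ValueError:
            continue  # skip tokens such as "unknown" or malformed entries
    return addresses


def claimed_client_ip(header_value: str) -> Optional[str]:
    """Leftmost address: the original client as reported by the proxy chain."""
    addresses = parse_x_forwarded_for(header_value)
    return addresses[0] if addresses else None


# Example addresses come from the documentation ranges, not from real hosts.
print(claimed_client_ip("203.0.113.7, 198.51.100.2, 10.0.0.1"))
```

Each proxy in a chain normally appends the address it received the request from, which is why the leftmost entry is the one closest to the user.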

☠️ Problem

The target server knows significantly more about you than it seems:

  1. Network and system data: It receives your real IP address (and consequently your approximate geographical location and internet provider), as well as detailed information about your software, extracted from the HTTP request headers (e.g., User-Agent indicates the operating system, browser, and its version).

  2. Digital Fingerprint (Browser Fingerprinting): This is a deeper and more insidious data collection technique. The server executes code on your browser's side (typically in JavaScript) that allows it to gather a unique "portrait":

    • Complete list of installed fonts (which can be obtained via CSS or Canvas API).
    • Screen dimensions and parameters (resolution, color depth, touch interface availability).
    • List of plugins and MIME types.
    • Temporal characteristics (graphics subsystem performance, clock frequency).
    • Locale data (time zone, language settings).
    • Data from browser local storage (Local Storage, Session Storage, Cookies). If you have visited the site before, it might have saved a unique identifier in your storage, which upon a revisit will uniquely link the new session to your old one, even if you changed your IP address.
    • EXIF data: metadata (Exchangeable Image File Format) embedded in digital photos, describing the shot and the camera settings. Strictly speaking, this is not part of the browser fingerprint, but photos you upload can carry it to the server.

Key takeaway: A proxy server effectively masks ONLY your IP address and, to some extent, your provider's data. It does not protect against digital fingerprinting or tracking through browser storage. Countering these threats requires additional tools: disabling JavaScript, using specialized anti-fingerprinting browsers, etc.


Why a Proxy is NOT Anonymity (and often the illusion of it):

1. The Proxy Provider Knows EVERYTHING.

Your traffic goes through the chain: You -> Your ISP -> Proxy Server -> Target Site. The owner of the proxy server sees:

  • Your real IP address.
  • All your unencrypted traffic (logins, passwords, browsing history).
  • Your connection times.
  • Metadata.

If it's a commercial or state-owned proxy, assume it keeps logs. You are simply swapping one "witness" (the target site) for another, often more centralized and dangerous one (the proxy provider).

2. DNS Leaks.

A classic mistake. You configured a proxy in your browser for HTTP/HTTPS traffic, but domain name resolution requests (e.g., converting google.com to an IP address) might bypass the proxy, going through your standard connection. In this case, your ISP and anyone listening on the network can see which sites you are visiting, even if the content itself loads through the proxy. SOCKS5 can proxy DNS, but this must be explicitly configured.

3. WebRTC Leaks.

This is a technology for video chats and P2P connections in the browser. Using a short piece of JavaScript, a site can make the browser reveal your real local (behind NAT)[2] and public IP addresses, bypassing the proxy settings. Protection requires disabling or restricting WebRTC in the browser settings or with an extension.


[2] NAT (Network Address Translation) is not a protocol or a service, but a mechanism used by routers to solve one of the biggest problems of the early internet.

Network address translation between a private network and the Internet

Briefly about the problem:

IPv4 addresses ran out. There are only about 4.3 billion of them, but there are many more devices.

NAT Solution: The router creates a private network (home, office) with "internal" addresses (e.g., 192.168.1.x). The router itself goes out to the internet with one "public" IP address.

NAT substitutes the addresses of devices from the private network with its own "public" IP, remembering which internal device the connection belongs to in order to return the response.

Simple analogy: It's like an office of 100 employees that shares one phone number, handled by a secretary. Everyone calls out through the secretary, who knows which employee to transfer each incoming call to.
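The translation table that NAT keeps can be sketched in a few lines of Python. This is a toy model under simplifying assumptions (the class name, the starting port 40000, and the addresses are all hypothetical, and real NAT also tracks protocol, remote endpoint, and timeouts).

```python
class NatRouter:
    """Toy port-based NAT: many private endpoints share one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Rewrite the source of an outgoing packet and remember the mapping."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_inbound(self, public_port):
        """Find which internal device a reply belongs to (None if unknown)."""
        return self.inbound.get(public_port)


router = NatRouter("203.0.113.10")
print(router.translate_outbound("192.168.1.5", 51000))
print(router.translate_inbound(40000))
```

The `inbound` dictionary plays the role of the secretary's notes: without an entry, an incoming packet has nowhere to go.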

4. Digital Fingerprint (Fingerprinting).

A site identifies you not by IP, but by a unique combination of:

  • Screen resolution, font list, plugins.
  • Browser and OS version.
  • Behavioral factors (typing rhythm, mouse movement).

This "fingerprint" will be the same with or without a proxy. The proxy is powerless here.
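A rough Python sketch shows why changing the IP does not change the fingerprint. It assumes the tracker simply hashes a set of browser attributes; real trackers use many more signals, and the attribute names and values here are invented for illustration.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a canonical form of the browser attributes (the IP is not among them)."""
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, as a script running in the page might report them.
browser = {
    "screen": "1920x1080",
    "fonts": ["Arial", "Noto Sans"],
    "user_agent": "ExampleBrowser/1.0",
    "timezone": "Europe/Berlin",
}

direct_visit = {"ip": "198.51.100.23", "fingerprint": fingerprint(browser)}
proxied_visit = {"ip": "203.0.113.50", "fingerprint": fingerprint(browser)}

# The IP differs between the two visits, the fingerprint does not.
print(direct_visit["fingerprint"] == proxied_visit["fingerprint"])  # True
```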

5. Behavioral Analysis.

Even if you logged in via a proxy, the analytics system on the site sees that "a user with IP 1.1.1.1 came from google.com via the query 'how to hack the pentagon', immediately logged into the account ivan_ivanov@mail.ru and started downloading instructions." The IP is different, but the person and their intentions are the same.

6. Cookies and Local Storage.

The browser still holds the cookies and Local Storage data it saved earlier. It keeps sending the cookies to the sites that set them, and their scripts can still read the stored identifiers. The proxy again has nothing to do with it.

Conclusion:

"A proxy server changes the IP address visible to the target server. This creates a basic, primitive level of concealment, which is easily bypassed by modern tracking methods and is useless against targeted de-anonymization.

Effective anonymity is a set of measures:

  1. Trustless model: Using systems like Tor (where several independent nodes do not know the full route) or high-quality VPNs with a No-Logs policy (which you are forced to trust).

Briefly:

"No-Logs" VPN is not a myth, but it requires verification. The key principle: you cannot hand over what you don't have (that is, data you never collected in the first place).

Where to look:

  1. Jurisdiction: Countries outside surveillance alliances (like the "14 Eyes"). Examples: Panama, British Virgin Islands, Seychelles.
  2. Practical verification: Trust only those whose "No-Logs" has been confirmed in court or by an audit.

Examples (verified):

What to check (checklist):

  • Jurisdiction
  • Confirmed cases of no logs
  • Independent audit
  • Technologies like RAM-only servers
  • Clean reputation

Summary: There is no absolute guarantee, but choosing the right VPN minimizes the risk, as the provider physically and legally cannot provide non-existent logs.

  2. Fighting fingerprinting: Using browsers like Tor Browser, which "pretend" to be the same for all users.
  3. Changing behavior: Refraining from logging into your accounts, using separate sessions, being cautious with JavaScript.

Using just a public proxy for anonymity is like hiding your face with a paper bag while leaving a tag with your name and address on your back. It solves one narrow task (changing IP), but does not make you anonymous."

Detailed Breakdown of the Waiter in a Restaurant Analogy

  • You (the client): your computer with a running browser.

  • The Waiter (the proxy): the proxy server with its own IP address.

  • The Kitchen (the web server): the target server, for example, Google's servers.

  • The Order (the HTTP request): a technically formed data packet containing the URL, method (GET, POST), headers, etc.

  • The Dish (the HTTP response): the HTML code of the page, images, CSS styles, and JavaScript files.

  1. The waiter can change the order. A proxy server is not always a "dumb" relay. It can modify the outgoing request. For example, it can:
  • Remove or change certain headers (e.g., Referer, which tells the server which page you came from).

  • Add its own headers.

  • Compress outgoing data to save traffic.

  • Block the request if it is directed to a prohibited resource (in corporate or ISP proxies).

  2. The waiter can give you something you didn't ask for. Instead of current data from the server, the proxy might return a cached copy, if one exists and hasn't expired. This is the very "traffic saving" that is rarely used on the client side today but is actively used on the ISP side to offload channels.

  3. Not all waiters are the same. The analogy does not account for fundamentally different types of proxies (HTTP, SOCKS, transparent), which operate at different layers of the OSI network model. A SOCKS proxy, for example, is not a "waiter" but more like a "courier service" that can deliver any cargo (any type of network traffic), not just "dishes from the restaurant" (web traffic).

Why is this needed? Detailed analysis of use cases

Using a proxy server is not a magical solution to all problems, but a specific tool for achieving certain network tasks.

1. Bypassing Blocks (IP-banning, geo-restrictions)

This is the most common and understandable use case.

  • Technical blocking mechanism: Most blocks on the internet are implemented based on IP addresses. A network administrator (in an office, school, country) or the owner of a web service (e.g., a streaming platform) compiles a blacklist of IP addresses or entire ranges. When a request comes from an IP address in this list, the server either does not respond or returns a blocking message.

  • How a proxy bypasses the block: Since your request comes from the proxy server's IP address, which is not on the blacklist, the block does not trigger. The request is processed successfully.

  • Important nuances:

  • IP address quality. If you use a public, free proxy, its IP address is highly likely already on the blacklists of many services, as it has been used for similar purposes before you.

  • Geolocation. To bypass geo-restrictions (e.g., to watch a service available only in the USA), you must use a proxy server with an IP address located in the required country. The service determines your "location" by the geolocation of the proxy's IP address.

2. Anonymity: Correcting the Understanding of the Term

Critically important: A proxy server does not make you an "invisible man" on the internet in an absolute sense. It provides a certain level of anonymity, which depends on its type and configuration.

  • What is hidden: Your real IP address from the target web server. This is the main thing.

  • What is NOT hidden or can be revealed:

  • The proxy server provider knows your real IP address, as you are the one establishing the connection with it.

  • If the traffic you send through the proxy is not encrypted (plain HTTP instead of HTTPS, or SOCKS5 without an encrypted tunnel around it), then your internet provider or anyone eavesdropping on your network (e.g., on public Wi-Fi) can see that you are using a proxy and potentially intercept that traffic.

  • There are different levels of proxy anonymity. Some types (transparent) do not hide the fact of using a proxy or your IP at all, while others (anonymous) hide the IP but do not hide the fact of using a proxy. Only elite (high anonymous) proxies do not inform the target server of their intermediation.

  • Your identity can be de-anonymized by behavioral factors ("digital fingerprint") — browser plugins used, screen resolution, fonts, even typing rhythm. A proxy does not protect against this.

Conclusion: A proxy is a tool for hiding an IP address from the end server, not a panacea for complete anonymity. For the latter, more complex systems like Tor are required.
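The three anonymity levels mentioned above can be told apart by looking at what the target server receives. Below is a heuristic Python sketch, assuming you test a proxy against a server you control and already know your real IP. The header list and the function name are assumptions made for illustration, not a standard.

```python
import re

# Headers that proxies commonly add; treated here as signs of intermediation.
PROXY_HEADERS = ("via", "x-forwarded-for", "forwarded", "x-real-ip")


def classify_proxy(headers: dict, real_client_ip: str) -> str:
    """Classify a proxy by the request headers that reached the target server."""
    lowered = {name.lower(): value for name, value in headers.items()}
    revealing = [lowered[name] for name in PROXY_HEADERS if name in lowered]
    for value in revealing:
        # Compare whole address-like tokens to avoid partial substring matches.
        if real_client_ip in re.findall(r"[0-9A-Fa-f:.]+", value):
            return "transparent"  # the real IP is passed along
    if revealing:
        return "anonymous"        # proxy use is visible, the IP is not
    return "elite"                # no trace of intermediation in the headers


print(classify_proxy({"Via": "1.1 cache", "X-Forwarded-For": "198.51.100.23"},
                     "198.51.100.23"))
```

A header check like this says nothing about fingerprinting or leaks, so an "elite" result only means the headers are clean.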

3. Security: An Additional Barrier with Caveats

Using a proxy can enhance security, but this is not its primary or most reliable function.

  • Content filtering: Corporate or ISP proxies are often used to block access to malicious sites. The proxy server can check the requested URL against a database of phishing and malware resources and block the connection before it reaches your device.

  • Virus scanning: Some proxies can check downloaded files (e.g., .exe, .zip) using antivirus engines.

  • Limiting risks on public Wi-Fi: When connected through a proxy, all your web traffic is redirected through it. If an attacker on the same public network tries to redirect you to a fake website, a filtering proxy can prevent this.

  • Caution: The proxy server itself represents a point of vulnerability. If you use an unreliable (especially free) proxy, its owner has the technical ability to intercept and analyze all your unencrypted traffic, including logins and passwords. Therefore, you should only trust verified providers, and it is always preferable to use encryption (HTTPS) over the proxy.

4. Data Parsing (Web Scraping)

This is a professional task where the proxy ceases to be a tool for the average user and becomes a critically important infrastructure component.

  • The problem: When automatically collecting data from sites, you send a large number of requests from one IP address in a short period of time. The site's protection systems (e.g., Cloudflare) easily detect such behavior as non-human and block the IP address on suspicion of a DDoS attack or scraping.

  • The solution: Using a pool (rotator) of proxy servers. The parsing program sends requests in turn through different proxies, thus distributing the load across hundreds or thousands of different IP addresses. For the target server, this looks like normal traffic from different devices around the world, which does not raise suspicions.

  • Requirements for parsing proxies: High speed, reliability, a large number of IP addresses in the pool, and often residential IP addresses (the IP type will be covered in part 2) are required, which are harder to block.
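The rotation idea can be sketched with the Python standard library. The proxy URLs below are hypothetical placeholders, and `fetch_via` is only one possible way to send a request through a given proxy; production scrapers add retries, health checks, and rate limits.

```python
import itertools
import urllib.request

# Hypothetical pool; replace with proxies you are allowed to use.
PROXY_POOL = [
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
    "http://proxy-c.example:8080",
]


def assign_proxies(urls, proxy_urls):
    """Pair each URL with the next proxy in round-robin order."""
    return list(zip(urls, itertools.cycle(proxy_urls)))


def fetch_via(url, proxy_url, timeout=10.0):
    """Fetch one URL through one proxy (performs real network I/O)."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.read()


pages = ["https://example.com/page/%d" % n for n in range(1, 5)]
for url, proxy in assign_proxies(pages, PROXY_POOL):
    print(url, "->", proxy)
```

With four URLs and three proxies, the fourth request wraps around to the first proxy again.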

5. SEO and SMM (Managing Multiple Accounts)

The principle here is similar to parsing, but applied in marketing.

  • The problem: Social media platforms (Instagram, Facebook, TikTok) and search engines (Google) strictly monitor account behavior. If several accounts log in from the same IP address, the system algorithms can link them together ("link by IP"). If one account is banned, there is a high probability of blocking all accounts linked to it.

  • The solution: Each account or group of accounts is assigned its own unique proxy server with a permanent (static) IP address. Thus, for the platform, each account exists in its own isolated network environment, as if it were managed from different apartments, offices, or cities. This significantly reduces the risk of mass blocking.
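One way to keep the account-to-proxy binding stable is a small registry that gives every account a dedicated proxy and never hands the same proxy to two accounts. This is a sketch with hypothetical names; real tools usually persist the mapping to disk.

```python
class ProxyAssignments:
    """Bind each account to its own static proxy, one proxy per account."""

    def __init__(self, proxies):
        self.free = list(proxies)  # proxies not yet bound to any account
        self.by_account = {}       # account_id -> proxy

    def proxy_for(self, account_id):
        """Return the account's proxy, assigning an unused one on first use."""
        if account_id not in self.by_account:
            if not self.free:
                raise RuntimeError("no unused proxy left for a new account")
            self.by_account[account_id] = self.free.pop(0)
        return self.by_account[account_id]


assignments = ProxyAssignments(["203.0.113.11:1080", "203.0.113.12:1080"])
print(assignments.proxy_for("shop_account_1"))
print(assignments.proxy_for("shop_account_1"))  # same proxy on every call
```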

6. Traffic Saving (Caching) — Historical Context

This is a function that has lost its relevance for the end user but remains important at the infrastructure level.

  • How it worked: The proxy server saved (cached) copies of frequently requested resources: images, CSS files, even entire HTML pages. When the next user on the same network (e.g., a company employee) requested the same resource, the proxy server delivered the data from its cache without accessing the external internet. This saved external bandwidth and sped up loading.

  • Why it's rare now:

  1. Encryption (HTTPS everywhere). The modern internet almost entirely operates over HTTPS. This means the connection between the browser and the server is encrypted. The proxy server cannot decrypt this traffic and, consequently, cannot analyze and cache its content. It can only cache unencrypted HTTP traffic, the share of which is negligible.

  2. Dynamic content. Web pages have become dynamic and personalized. Social feeds, news streams, advertising — the content on the same URL is different for different users and at different times. Such content is pointless to cache.

  3. CDN (Content Delivery Network). The caching function has been taken over by global content delivery networks (Cloudflare, Akamai, etc.), which place copies of content on servers worldwide, close to users.

Today, caching proxies are used mainly by internet providers to reduce load on backbone channels (caching popular video from YouTube) and in large corporate networks for caching operating system and program updates.
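The caching behavior described in this section can be reduced to a short Python sketch. It assumes a fixed time-to-live per entry and a caller-supplied `fetch` function; real caching proxies follow HTTP cache headers instead, so treat this as an illustration of the idea only.

```python
import time


class CachingProxy:
    """Serve repeated requests from memory until the cached copy expires."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self.fetch = fetch   # function that retrieves a URL from the origin
        self.ttl = ttl_seconds
        self.clock = clock   # injectable clock, which makes testing easy
        self.cache = {}      # url -> (expires_at, body)

    def get(self, url):
        now = self.clock()
        entry = self.cache.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]                      # cache hit: origin not contacted
        body = self.fetch(url)                   # cache miss or expired copy
        self.cache[url] = (now + self.ttl, body)
        return body


origin_requests = []

def origin(url):
    origin_requests.append(url)
    return b"<html>logo page</html>"

proxy = CachingProxy(origin, ttl_seconds=30.0)
proxy.get("http://intranet.example/logo")
proxy.get("http://intranet.example/logo")
print(len(origin_requests))  # 1: the second request was served from the cache
```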

The Most Important Proxy Types: Classification by Protocol and Operation Level

Classifying proxies by protocol is the foundation, as it determines what specific network traffic the proxy can work with.


1. HTTP(S) Proxy — Specialized Web Inspectors

  • Operation Level: Application Layer (in both the OSI and TCP/IP models). The key feature is that the proxy understands the structure and semantics of the HTTP protocol.
  • Principle of Operation: Deep understanding of HTTP allows such a proxy to:
    • Analyze and modify HTTP headers (add, remove, change).
    • Cache content — save copies of frequently requested resources to speed up access.
    • Filter traffic based on URL, MIME types, or content.
    • Require authentication from the user (login/password).
  • Critical Difference between HTTP and HTTPS:
    • HTTP proxy: Works with unencrypted traffic. Sees everything: full URLs, sent data (logins, passwords), page contents. It is a "Man-in-the-Middle" by nature.
    • HTTPS proxy (CONNECT mode): To protect content from being viewed, the client sends the CONNECT command. The proxy opens a TCP connection to the target and relays bytes in both directions, while the TLS session is negotiated end to end between the client and the server. Important: In this mode, the proxy cannot decrypt or modify the transmitted data. It only relays encrypted bytes, acting as a "dumb pipe." It sees only the hostname and port of the target server (specified in CONNECT), but not the full URL path or the data inside the session.
  • Hard Limitations: Work EXCLUSIVELY with HTTP/HTTPS traffic. Not suitable for FTP, game servers, VoIP, or torrents (with rare exceptions when the client can only proxy HTTP requests to trackers).
  • Scope of Application: Web browsers and applications using HTTP/HTTPS APIs (REST, GraphQL clients).
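To show what the proxy actually sees in CONNECT mode, here is a minimal Python sketch that builds the request a client sends and checks the proxy's answer. The helper names are illustrative, and proxy authentication headers are left out.

```python
def build_connect_request(host: str, port: int) -> bytes:
    """Build the CONNECT request; host and port are all the proxy learns."""
    target = f"{host}:{port}"
    return f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n".encode("ascii")


def tunnel_established(response_head: bytes) -> bool:
    """A 2xx status line means the proxy opened the tunnel."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(None, 2)
    return (
        len(parts) >= 2
        and parts[0].startswith(b"HTTP/")
        and len(parts[1]) == 3
        and parts[1].isdigit()
        and parts[1].startswith(b"2")
    )


print(build_connect_request("example.com", 443).decode("ascii"))
print(tunnel_established(b"HTTP/1.1 200 Connection established\r\n\r\n"))
```

After a successful reply, the client starts the TLS handshake over the same socket, and from then on the proxy relays bytes it cannot read.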

2. SOCKS Proxy — Universal Transport Relays

  • Operation Level: Session Layer or even between Session and Transport Layers. SOCKS operates below HTTP proxies and is abstracted from application layer protocols. For it, the transmitted data is an opaque byte stream.
  • Principle of Operation: A SOCKS proxy redirects TCP/UDP packets without delving into their content. It does not analyze headers, cache, or filter content. Its task is to create a "transport corridor."
  • Version Evolution:
    • SOCKS4: Outdated standard. Supports only TCP and has no real authentication (only an unverified user ID field).
    • SOCKS5: Modern standard. Key improvements:
      • UDP support: Indispensable for DNS queries, VoIP (Zoom, Telegram), online games.
      • Flexible authentication: Methods include USERNAME/PASSWORD, GSSAPI, and NO AUTHENTICATION.
      • IPv6 support.
      • Remote DNS resolution (Remote DNS): Critically important option. When activated on the client, DNS queries are proxied through the SOCKS connection, preventing leaks. Without it, DNS queries bypass the proxy, revealing browsing history.
  • Scope of Application: Any applications not limited to web protocols: torrent clients, online games, messengers, SSH and FTP clients.

Let's break down RFC 1928, SOCKS Protocol Version 5, in detail. This is not just a dry standard, but an elegant engineering solution that remains relevant almost 30 years later.

Architecture and Philosophy of SOCKS5 (RFC 1928)

Key Idea: SOCKS5 is a "shim-layer" between the Application (L7) and Transport (L4) layers, providing transparent traffic tunneling through firewalls.

[Application] -> [SOCKS client] -> [SOCKS server] -> [Target Server]

Connection Establishment Process (3 Phases)

Phase 1: Greeting and Authentication Method Selection

The client connects to the SOCKS server (conventionally on TCP port 1080) and sends:

+------+----------+----------+
| VER  | NMETHODS | METHODS  |
+------+----------+----------+
| 0x05 |    1     | 1 to 255 |
+------+----------+----------+

Breakdown:

  • VER = 0x05 — SOCKS5 version
  • NMETHODS — number of supported authentication methods
  • METHODS — list of methods:
0x00 - NO AUTHENTICATION REQUIRED
0x01 - GSSAPI (Kerberos)
0x02 - USERNAME/PASSWORD
0x03-0x7F - IANA ASSIGNED
0x80-0xFE - PRIVATE METHODS
0xFF - NO ACCEPTABLE METHODS

Server responds:

+------+--------+
| VER  | METHOD |
+------+--------+
| 0x05 | chosen |
+------+--------+

If the server returns 0xFF, the connection is terminated.
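Phase 1 can be expressed as two small Python helpers that build the greeting and interpret the server's choice. The function and constant names are illustrative; the byte values follow the tables above.

```python
SOCKS_VERSION = 0x05
NO_AUTH = 0x00
GSSAPI = 0x01
USERNAME_PASSWORD = 0x02
NO_ACCEPTABLE_METHODS = 0xFF


def build_greeting(methods):
    """VER, NMETHODS, then one byte per offered authentication method."""
    if not 1 <= len(methods) <= 255:
        raise ValueError("offer between 1 and 255 methods")
    return bytes([SOCKS_VERSION, len(methods), *methods])


def parse_method_selection(reply: bytes) -> int:
    """Return the method the server chose, or raise if it refused all of them."""
    if len(reply) != 2 or reply[0] != SOCKS_VERSION:
        raise ValueError("not a SOCKS5 method selection message")
    if reply[1] == NO_ACCEPTABLE_METHODS:
        raise ConnectionError("server accepted none of the offered methods")
    return reply[1]


greeting = build_greeting([NO_AUTH, USERNAME_PASSWORD])
print(greeting.hex())                        # 05020002
print(parse_method_selection(b"\x05\x02"))   # 2
```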

Phase 2: Authentication (method-dependent)

For method 0x02 (USERNAME/PASSWORD), defined in RFC 1929, the client sends:

+------+------+----------+------+----------+
| VER  | ULEN |  UNAME   | PLEN |  PASSWD  |
+------+------+----------+------+----------+
| 0x01 |  1   | 1 to 255 |  1   | 1 to 255 |
+------+------+----------+------+----------+

Server responds:

+------+--------+
| VER  | STATUS |
+------+--------+
| 0x01 |   1    |
+------+--------+

Statuses: 0x00 means success; any other value means failure, and the server closes the connection.
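The username/password sub-negotiation is simple enough to sketch directly in Python. The helper names are illustrative, and the UTF-8 encoding of the credentials is an assumption, since RFC 1929 does not prescribe one. Note that the credentials travel in cleartext.

```python
AUTH_VERSION = 0x01  # version of the sub-negotiation, not of SOCKS itself


def build_auth_request(username: str, password: str) -> bytes:
    """VER, ULEN, UNAME, PLEN, PASSWD with one-byte length prefixes."""
    user = username.encode("utf-8")
    secret = password.encode("utf-8")
    if not (1 <= len(user) <= 255 and 1 <= len(secret) <= 255):
        raise ValueError("username and password must be 1 to 255 bytes each")
    return bytes([AUTH_VERSION, len(user)]) + user + bytes([len(secret)]) + secret


def auth_succeeded(reply: bytes) -> bool:
    """Only STATUS 0x00 means success; anything else is a failure."""
    return len(reply) == 2 and reply[0] == AUTH_VERSION and reply[1] == 0x00


print(build_auth_request("bob", "pw"))
print(auth_succeeded(b"\x01\x00"))
```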

Phase 3: Sending the Request

After successful authentication, the client sends the main request:

+----+-----+-------+------+----------+----------+
|VER | CMD |  RSV  | ATYP | DST.ADDR | DST.PORT |
+----+-----+-------+------+----------+----------+
| 1  |  1  | 0x00  |  1   | Variable |    2     |
+----+-----+-------+------+----------+----------+

Request Types (CMD)

CONNECT (0x01) - establish a TCP connection to the target server
BIND (0x02) - reverse connection (for FTP, P2P)
UDP ASSOCIATE (0x03) - work with UDP traffic

🌐 Address Types (ATYP)

0x01 - IPv4 (4 bytes)
0x03 - DOMAINNAME (1 byte length + string without NULL)
0x04 - IPv6 (16 bytes)

Important: For domain names, the first byte contains the name length, followed by the string WITHOUT a null terminator.
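The request layout above translates into the following Python sketch. The function name is illustrative. When the host is passed as a name, the request uses ATYP 0x03 and resolution happens on the proxy side, which is the mechanism behind the "Remote DNS" option discussed earlier.

```python
import ipaddress
import struct

CMD_CONNECT = 0x01
ATYP_IPV4, ATYP_DOMAIN, ATYP_IPV6 = 0x01, 0x03, 0x04


def build_request(host: str, port: int, cmd: int = CMD_CONNECT) -> bytes:
    """VER, CMD, RSV, ATYP, DST.ADDR, DST.PORT (port in network byte order)."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: send the name itself, length-prefixed, no NUL byte.
        name = host.encode("ascii")
        if not 1 <= len(name) <= 255:
            raise ValueError("domain name must be 1 to 255 bytes")
        address = bytes([ATYP_DOMAIN, len(name)]) + name
    else:
        atyp = ATYP_IPV4 if ip.version == 4 else ATYP_IPV6
        address = bytes([atyp]) + ip.packed
    return bytes([0x05, cmd, 0x00]) + address + struct.pack("!H", port)


print(build_request("example.com", 443).hex())
print(build_request("192.0.2.1", 80).hex())
```

A client that resolves the name locally and then sends ATYP 0x01 produces a working connection too, but the DNS query has already left through the normal network path.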

Server Response

+----+-----+-------+------+----------+----------+
|VER | REP |  RSV  | ATYP | BND.ADDR | BND.PORT |
+----+-----+-------+------+----------+----------+
| 1  |  1  | 0x00  |  1   | Variable |    2     |
+----+-----+-------+------+----------+----------+

Response Codes (REP)

0x00 - succeeded
0x01 - general SOCKS server failure
0x02 - connection not allowed by ruleset
0x03 - Network unreachable
0x04 - Host unreachable
0x05 - Connection refused
0x06 - TTL expired
0x07 - Command not supported
0x08 - Address type not supported
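A client has to decode this reply before it can use the connection. The Python sketch below parses the fields and maps REP to the messages listed above. The function name and the returned tuple shape are choices made for this example.

```python
import ipaddress
import struct

REPLY_MESSAGES = {
    0x00: "succeeded",
    0x01: "general SOCKS server failure",
    0x02: "connection not allowed by ruleset",
    0x03: "Network unreachable",
    0x04: "Host unreachable",
    0x05: "Connection refused",
    0x06: "TTL expired",
    0x07: "Command not supported",
    0x08: "Address type not supported",
}


def parse_reply(data: bytes):
    """Return (message, bound_address, bound_port) from a SOCKS5 reply."""
    if len(data) < 5 or data[0] != 0x05:
        raise ValueError("not a SOCKS5 reply")
    rep, atyp = data[1], data[3]
    if atyp == 0x01:
        offset, addr_len = 4, 4
    elif atyp == 0x04:
        offset, addr_len = 4, 16
    elif atyp == 0x03:
        offset, addr_len = 5, data[4]  # first address byte is the name length
    else:
        raise ValueError("unknown address type")
    end = offset + addr_len
    if len(data) < end + 2:
        raise ValueError("truncated reply")
    raw = data[offset:end]
    address = raw.decode("ascii") if atyp == 0x03 else str(ipaddress.ip_address(raw))
    (port,) = struct.unpack("!H", data[end:end + 2])
    return REPLY_MESSAGES.get(rep, "unassigned reply code"), address, port


print(parse_reply(b"\x05\x00\x00\x01\x7f\x00\x00\x01\x04\x38"))
```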

Special Scenarios

BIND Request (for FTP)

  1. Client sends a BIND request
  2. Server creates a socket and sends the first response with BND.ADDR/BND.PORT
  3. Client communicates this data to the FTP server
  4. When the FTP server connects, the SOCKS server sends a second response

UDP ASSOCIATE

  • Creates an association for UDP traffic
  • Client must send UDP packets to BND.ADDR:BND.PORT
  • Each datagram has a header:
+----+------+------+----------+----------+----------+
|RSV | FRAG | ATYP | DST.ADDR | DST.PORT |   DATA   |
+----+------+------+----------+----------+----------+
| 2  |  1   |  1   | Variable |    2     | Variable |
+----+------+------+----------+----------+----------+
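Wrapping a UDP payload in this header looks as follows in Python. The sketch handles only IP-literal destinations and standalone datagrams by default; the function name is illustrative.

```python
import ipaddress
import struct


def build_udp_datagram(dst_ip: str, dst_port: int, payload: bytes, frag: int = 0) -> bytes:
    """Prefix the payload with RSV (2 bytes), FRAG, ATYP, DST.ADDR, DST.PORT."""
    ip = ipaddress.ip_address(dst_ip)
    atyp = 0x01 if ip.version == 4 else 0x04
    header = b"\x00\x00" + bytes([frag, atyp]) + ip.packed + struct.pack("!H", dst_port)
    return header + payload


# A hypothetical DNS query payload addressed to a resolver at 8.8.8.8:53.
datagram = build_udp_datagram("8.8.8.8", 53, b"<dns query bytes>")
print(datagram[:10].hex())
```

The client sends the resulting datagram to the BND.ADDR:BND.PORT it received in the UDP ASSOCIATE reply, not to the final destination.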

💡 Critically Important Features

1. DNS Support at the Protocol Level

SOCKS5 can resolve domain names on the server side (ATYP=0x03). This prevents DNS leaks, provided the client passes the hostname to the proxy instead of resolving it locally first.

2. UDP Fragmentation

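Fragmentation is signaled via the FRAG field. Implementing it is optional: a server that does not support fragmentation must drop any datagram whose FRAG field is not 0x00.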

  • 0x00 — standalone datagram
  • 1-127 — fragment position
  • High bit = 1 — last fragment

3. Security through Authentication

Unlike SOCKS4, it has a built-in authentication system.

Practical Significance Today

  1. Bypassing blocks — the basis of most proxy services
  2. Tor Network — uses SOCKS5 as an interface for applications
  3. Mobile applications — many VPN applications use SOCKS5
  4. Data parsing — IP rotation via SOCKS5 proxies

Limitations and Caveats

  • No encryption by default — traffic between client and SOCKS server is not encrypted
  • WebRTC leaks — SOCKS5 does not protect against WebRTC leaks
  • Requires client-side support — the application must be SOCKS-aware

Conclusion

SOCKS5 is an elegant, minimalist protocol that solves a specific task: transparently tunneling traffic through an intermediate node. Its strength lies in its simplicity and extensibility, which explains its longevity in the ever-changing world of network technologies.

The protocol is perfectly suited for situations where you need to change the source of an outgoing connection without modifying the application itself at the protocol level.

3. Transparent Proxies — The Coercive Network Overseer

This is not a protocol, but an implementation method, invisible to the end user.

  • Principle of Operation: Proxy settings are not configured in the OS or applications. The network gateway (router, firewall) using mechanisms like NAT (Destination NAT) or Policy-Based Routing (PBR) automatically redirects all outgoing traffic (ports 80, 443) to an internal proxy server.
  • What the end server sees: The configuration varies, but in the classic case, the proxy, while remaining "invisible" to the client, adds a special header (most often X-Forwarded-For) to the HTTP request, in which it passes the user's real IP address. Thus, the server knows both the fact of proxying and the original client IP.
  • Where it's found: Corporate and ISP networks, public Wi-Fi (airports, hotels). Goals — caching, content filtering, monitoring.
  • Hard Verdict: Absolutely useless for anonymity and bypassing geographical blocks, as it does not hide, and often directly reveals your IP address. Its purpose is control, not freedom.

🎯 Final Conclusion

A proxy is a tool for changing your IP, not for anonymity. It solves specific problems: bypassing blocks, parsing, and account management. True anonymity requires a combination of measures: Tor, anti-tracking browsers, and behavior modification, among others. Choose the proxy type based on your specific needs. There are no one-size-fits-all solutions.