Stanford CS144 Computer Networks Podcast: The Bottom Layer of Computer Networks - From the Physical Layer to the Link Layer

2025.09.01

As we enjoy smooth web browsing and high-definition video calls, we rarely consider how the data that makes up our digital lives is precisely delivered across thousands of kilometers, from a server's network interface card to the Wi-Fi chip in our phone. All this magic begins at the very bottom of the network protocol stack: the physical and link layers.

In the famous "hourglass" model of the internet, the IP protocol is the "waist" connecting the upper and lower layers. Beneath IP lies a diverse world, carrying various technologies such as Ethernet, Wi-Fi, and DSL. This article will begin with the most basic physical signals and gradually demystify the physical and link layers.

Physical layer: converting the bit stream into analog signals

The physical layer is the cornerstone of network communications. Its core task is to convert the digital bit stream transmitted by the upper layer into physical signals that can propagate through cables, optical fibers, or air. This process may seem simple, but it is full of intricate designs and trade-offs.

Elastic Buffer: An Elegant Solution to Clock Asynchrony

You might assume that the clocks of the sender and receiver in network communications are perfectly synchronized. In reality, it is nearly impossible to manufacture two clock crystals with exactly the same frequency; there will always be a tiny difference between them, perhaps only a few parts per million (ppm).

This means there will always be a slight difference between the rate at which the sender transmits bits and the rate at which the receiver consumes them. If the sender is slightly faster, the receiver's buffer will eventually overflow; conversely, if the receiver is slightly faster, the buffer will underflow.

To solve this problem, the physical layer introduces an elastic buffer. You can think of it as a first-in-first-out (FIFO) queue that cleverly reconciles the rate difference between the sending and receiving ends.

Write pointer (driven by the recovered sender clock)
      |
      v
[ | | | | | | | | | | ]  <-- buffer
      ^
      |
  Read pointer (driven by the receiver's local clock)

State 1: sender slightly faster, the write pointer gains on the read pointer
[ | | | |X|X|X|X| | | ]

State 2: receiver slightly faster, the read pointer gains on the write pointer
[ | |X|X| | | | | | | ]

To ensure that this buffer neither overflows nor underflows, network protocols have designed two key mechanisms:

  1. Maximum Transmission Unit (MTU): limits the maximum length of a single packet, ensuring that even in the worst case (fastest sender, slowest receiver) a single packet cannot fill the buffer.
  2. Inter-packet gap (IPG): the sender must pause briefly after sending each packet. This gap gives the receiver time to drain its buffer back to a safe intermediate level before the next packet arrives.

Therefore, the more accurate the clocks (the smaller the tolerance), the smaller the required elastic buffer and inter-packet gap. In an ideal world with perfectly synchronized clocks, both could be zero.
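
A quick back-of-the-envelope calculation shows why tighter clock tolerance shrinks the required buffer. The figures below (1500-byte MTU, ±100 ppm clocks) are illustrative assumptions, not values from any particular standard:

```python
# Back-of-the-envelope sketch (illustrative numbers, not from any standard):
# how far the write and read pointers of an elastic buffer can drift apart
# over one maximum-length packet, given a clock tolerance in ppm.

def worst_case_drift_bits(mtu_bytes: int, tolerance_ppm: float) -> float:
    """Worst case: the sender runs +tolerance fast and the receiver
    -tolerance slow, so their rates differ by 2 * tolerance; over one
    packet the pointers drift apart by that fraction of the packet's bits."""
    return mtu_bytes * 8 * 2 * tolerance_ppm * 1e-6

print(f"{worst_case_drift_bits(1500, 100):.1f} bits")  # 2.4 bits
```

Even in this worst case the pointers drift apart by only a couple of bits per packet, so a small elastic buffer plus a modest inter-packet gap is enough.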

Shannon Limit and Modulation: The Ceiling of Channel Capacity

How fast can a network cable run? The theoretical upper limit to this question was provided by Claude Shannon, the father of information theory. The Shannon capacity theorem states that the maximum error-free data rate of a channel is determined by the following formula:

C = B log2(1 + S/N)

where:

  • C is the channel capacity (Capacity), in bits per second (bps).
  • B is the bandwidth of the channel (Bandwidth), in hertz (Hz).
  • S/N is the signal-to-noise ratio (SNR), the ratio of signal power to noise power.

This formula reveals a core truth:  to increase data transmission rate, either increase bandwidth or improve signal-to-noise ratio.
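
To make the trade-off concrete, here is a small sketch that evaluates the Shannon formula. The 20 MHz bandwidth and 30 dB SNR figures are illustrative assumptions, not values from the text:

```python
import math

def shannon_capacity_bps(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon capacity C = B * log2(1 + S/N), with S/N as a linear ratio."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Illustrative figures (assumed): a 20 MHz channel at 30 dB SNR.
# 30 dB converts to a linear power ratio of 10**(30/10) = 1000.
c = shannon_capacity_bps(20e6, 10 ** (30 / 10))
print(f"{c / 1e6:.1f} Mbps")  # ~199.3 Mbps
```

Doubling the bandwidth doubles the capacity, while doubling the SNR only adds roughly one more bit per symbol period, which is why wider channels are so valuable.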

So, how do we use a channel to transmit bits? The answer is  modulation  . Modulation is the process of "translating" digital bits into analog signals. Common modulation methods include:

  • Amplitude Shift Keying (ASK): represents 0 and 1 by changing the amplitude of the signal.
  • Frequency Shift Keying (FSK): represents 0 and 1 by changing the frequency of the signal.
  • Phase Shift Keying (PSK): represents 0 and 1 by changing the phase of the signal.

To carry more bits in each signal unit (called a symbol), modern communication systems often combine amplitude and phase; this combination is quadrature amplitude modulation (QAM).

We can visually represent these modulation schemes using  an IQ constellation diagram  . Each point on the constellation represents a unique symbol, defined by a specific amplitude and phase. A symbol can encode multiple bits. For example, QPSK (Quadrature Phase Shift Keying) has four points, and each symbol can carry two bits.

Q (Quadrature)
        ^
        |
   10 o | o 00   (QPSK: 4 points, 2 bits/symbol)
        |
  ------+------> I (In-phase)
        |
   11 o | o 01
        |

The more complex 16-QAM has 16 points, and each symbol can carry 4 bits, which greatly increases the data transmission rate. However, the points are more densely spaced, and the resistance to noise decreases accordingly.
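
The bits-per-symbol relationship is simply a base-2 logarithm of the constellation size; a minimal sketch:

```python
import math

def bits_per_symbol(constellation_points: int) -> int:
    """A symbol distinguishes one of N constellation points, so it can
    carry log2(N) bits (N is assumed to be a power of two here)."""
    return int(math.log2(constellation_points))

print(bits_per_symbol(4))     # 2  (QPSK)
print(bits_per_symbol(16))    # 4  (16-QAM)
print(bits_per_symbol(1024))  # 10 (1024-QAM)
```

Each doubling of the constellation adds only one bit per symbol while halving the spacing between neighboring points, which is exactly the noise-resistance trade-off described above.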

The cornerstone of the physical layer: coding, multiplexing, and transmission delay

Now that we understand the channel capacity blueprint that Shannon's theorem paints for us, let's delve into some of the more specific technical details of the physical layer, which together form a solid foundation for data communications.

Nyquist's Criterion and Shannon's Theorem: Ideal and Reality

Shannon's theorem describes the rate limit of a noisy channel. In an ideal, noiseless channel, the upper limit of the data rate is given by the Nyquist theorem:

C = 2W log2(N)

where:

  • W is the channel bandwidth (Hz).
  • N is the number of states a symbol can represent.
  • 2W is the maximum symbol rate of the channel, also known as the baud rate.

The key difference: Nyquist's criterion describes how bandwidth limits the symbol rate in a noise-free environment, while Shannon's theorem describes how the signal-to-noise ratio limits the information rate (bits) in a noisy one. Nyquist tells us "how many packages can be delivered at most"; Shannon tells us "how much cargo can be packed into each package."

The relevant concepts are explained below.

A symbol is the smallest modulation unit the physical layer sends in each time slot; think of it as one "signal point." A symbol can represent one bit or several, depending on the number of constellation points in the modulation scheme. A bit is the unit of information; the symbol rate (or baud rate) is the number of symbols transmitted per second; and the bit rate = symbol rate × bits carried per symbol.

In other words, a symbol is a point in the physical signal, and the number of bits it carries depends on the size of the modulation constellation. Nyquist gives an upper limit relating bandwidth to symbol (and hence bit) rate in the absence of noise; Shannon lowers this limit in the presence of noise by bringing in the signal-to-noise ratio.

Therefore: symbol = signal "point", bit = unit of information, and symbol rate × bits per symbol = bit rate.
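
These relationships can be checked with a short sketch of the Nyquist formula. The 4 kHz bandwidth and 16-state symbols are illustrative assumptions:

```python
import math

def nyquist_bit_rate_bps(bandwidth_hz: float, states_per_symbol: int) -> float:
    """Noise-free Nyquist limit: C = 2W * log2(N). 2W is the maximum
    symbol (baud) rate; log2(N) is the bits carried per symbol."""
    return 2 * bandwidth_hz * math.log2(states_per_symbol)

# Illustrative: a 4 kHz channel with 16-level symbols.
# Baud rate = 2 * 4000 = 8000 symbols/s; 4 bits/symbol -> 32 kbps.
print(nyquist_bit_rate_bps(4000, 16))  # 32000.0
```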

Dissecting Network Latency: Transmission Latency and Propagation Latency

The time it takes for a data packet to travel from source to destination, also known as latency, consists of two main components:

  1. Transmission Delay: the time required to push all of a packet's bits onto the link, equal to frame size / data rate. This is like the time it takes for a train to completely leave the station.
  2. Propagation Delay: the time it takes for the first bit to travel from the transmitter to the receiver, equal to distance / propagation speed. This is like the time it takes for the locomotive to travel from its origin to its destination.

For example, consider sending a 1500-byte packet over a 100 Mbps Ethernet link that is 100 meters long (signal propagation speed: 2 × 10^8 m/s):

  • Transmission delay = (1500 × 8) bits / (100 × 10^6) bps = 120 µs
  • Propagation delay = 100 m / (2 × 10^8) m/s = 0.5 µs
  • Total delay ≈ transmission delay + propagation delay = 120.5 µs

In this example, the transmission delay is far larger than the propagation delay, which is typical of high-speed LANs.
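
The example above can be reproduced with a few lines of arithmetic:

```python
def transmission_delay_s(frame_bytes: int, rate_bps: float) -> float:
    """Time to push all bits of the frame onto the link."""
    return frame_bytes * 8 / rate_bps

def propagation_delay_s(distance_m: float, speed_mps: float = 2e8) -> float:
    """Time for the first bit to travel from sender to receiver."""
    return distance_m / speed_mps

tx = transmission_delay_s(1500, 100e6)   # 120 µs
prop = propagation_delay_s(100)          # 0.5 µs
print(f"{(tx + prop) * 1e6:.1f} µs")     # 120.5 µs
```

On a long-haul fiber link the balance flips: propagation delay across thousands of kilometers dwarfs the time to serialize a single frame.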

Multiplexing: running multiple vehicles on one road

In order to improve channel utilization,  multiplexing  technology came into being, which allows multiple independent signals to share the same physical channel.

  • Frequency Division Multiplexing (FDM)  : This technology divides the total bandwidth of a channel into multiple non-overlapping sub-bands, each of which carries a single signal. Radio stations are a classic example of this.
  • Time Division Multiplexing (TDM)  : Time is divided into recurring frames, each frame is further divided into multiple time slots, and each time slot is allocated to a signal. T1/E1 digital trunk lines are a typical application of TDM.
  • Wavelength Division Multiplexing (WDM)  : Used in fiber-optic communications, WDM is essentially frequency-division multiplexing of light. It uses light of different wavelengths (colors) to carry different signals, greatly increasing the transmission capacity of optical fibers.

Link layer: reliable transmission on shared channels

The link layer sits above the physical layer. Using the services the physical layer provides, it offers the network layer data transmission between directly connected nodes over a single-hop link.

The Battle of Bits: Errors and Error-Correcting Codes

Because noise is ubiquitous in the physical world, the receiver is always likely to make mistakes when decoding the signal, resulting in bit errors. The lower the signal-to-noise ratio (SNR), or the denser the constellation, the higher the bit error rate (BER).

Retransmitting an entire data packet due to a single or two bit error would be extremely inefficient. To address this, engineers invented  forward error correction (FEC)  . Its core idea is to proactively add redundant information to the original data. This way, even if errors occur during transmission, the receiver can use this redundant information to "guess" and correct the error, avoiding costly retransmissions.

A classic and powerful FEC algorithm is  the Reed-Solomon (RS) code  . Its principle can be understood in a simple way:

  1. Treat a block of K raw data symbols as the coefficients of a polynomial of degree K-1.
  2. Evaluate this polynomial at N different points (N > K) and send the N resulting values as the encoded data.
  3. Since any K points uniquely determine a degree-(K-1) polynomial, as long as the receiver correctly receives any K of the N points, it can reconstruct the polynomial and recover the entire original block.

This means an RS(N, K) code can tolerate the loss of up to N-K data symbols (erasure errors). This powerful error-correction capability has made FEC widely used in Wi-Fi, cellular networks, and even optical disc storage.
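
The polynomial idea can be sketched in a few lines. This is a toy systematic erasure code over a small prime field, not a real RS(N, K) codec (production RS codes work over GF(2^8) with far more efficient algorithms); this variant defines the polynomial by passing through (i, data[i]), so the first K transmitted symbols are the data itself:

```python
P = 7919  # an arbitrary prime larger than any data value (an assumption)

def lagrange_eval(points, x):
    """Evaluate the unique polynomial through `points` at x, mod P."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, P-2, P) is den's modular inverse (Fermat's little theorem)
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def encode(data, n):
    """Transmit the polynomial's values at x = 0..n-1. The polynomial is
    defined by passing through (i, data[i]), so the first K values are
    the data itself (a systematic code)."""
    pts = list(enumerate(data))
    return [lagrange_eval(pts, x) for x in range(n)]

def decode(received, k):
    """Recover the data from any k surviving (x, value) pairs."""
    pts = list(received.items())[:k]
    return [lagrange_eval(pts, i) for i in range(k)]

data = [12, 34, 56]                               # K = 3 data symbols
code = encode(data, 5)                            # N = 5: tolerates 2 erasures
survivors = {0: code[0], 3: code[3], 4: code[4]}  # symbols 1 and 2 are lost
print(decode(survivors, 3))                       # [12, 34, 56]
```

Any three of the five transmitted values suffice; which three survive does not matter.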

We mentioned the power of forward error correction (FEC) earlier. However, in many scenarios, error correction isn't necessary. Simply detecting errors and requesting retransmission (which is handled by higher-layer protocols like TCP) is sufficient, saving significant computational overhead and redundant bits.

Cyclic Redundancy Check (CRC): An Efficient Error Detection Tool

Cyclic Redundancy Check (CRC)  is the most widely used error detection technology at the link layer (especially in Ethernet and Wi-Fi). It is based on the principle of polynomial division.

The workflow is as follows:

  1. The sender and receiver agree in advance on a generator polynomial G(x) (for example, x^4 + x + 1, corresponding to the binary string 10011).
  2. To send k bits of original data, the sender first appends r zeros to the data, where r is the highest power of the generator polynomial.
  3. It then performs "modulo-2 division" (i.e., bitwise XOR) of this zero-padded data string by the binary number corresponding to the generator polynomial.
  4. The r-bit remainder is the CRC checksum. The sender replaces the r appended zeros with it and transmits the whole frame (original data + CRC checksum).
  5. On receiving the frame, the receiver performs the same modulo-2 division by the generator polynomial over the entire frame. If the remainder is zero, the transmission is considered error-free; otherwise, the frame is considered corrupted.

CRC has very strong error detection capability and can effectively detect most single-bit, multi-bit and burst errors.
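
The steps above can be sketched directly as bitwise modulo-2 division. The x^4 + x + 1 generator is the example from the text; the data string is an arbitrary choice, and real Ethernet uses the 32-bit CRC-32 polynomial:

```python
def crc_remainder(bits: str, generator: str) -> str:
    """Modulo-2 division: XOR the generator into the dividend wherever
    the current leading bit is 1; the final r bits are the remainder."""
    r = len(generator) - 1
    dividend = list(bits + "0" * r)          # step 2: append r zeros
    for i in range(len(bits)):
        if dividend[i] == "1":               # step 3: modulo-2 division
            for j, g in enumerate(generator):
                dividend[i + j] = str(int(dividend[i + j]) ^ int(g))
    return "".join(dividend[-r:])            # step 4: the r-bit checksum

data = "11010011101100"                      # arbitrary example payload
checksum = crc_remainder(data, "10011")      # generator x^4 + x + 1
frame = data + checksum
# Step 5: the receiver divides the whole frame by the same generator;
# a zero remainder means no error was detected.
print(crc_remainder(frame, "10011"))  # 0000
```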

Hamming code: a sophisticated error-correcting code

Unlike CRC, which can only detect errors,  Hamming Code  is a relatively simple but very sophisticated error-correcting code that can correct single-bit errors.

The core idea of a Hamming code is to insert k check bits into m data bits, forming a new codeword of m+k bits. The check bits are placed at positions that are powers of two (1, 2, 4, 8, ...), and the value of each check bit is the XOR of the bits at the specific positions it covers.

When the receiver receives the codeword, it recalculates these check bits. If some of the results don't match the received check bits, the sum of the positions of the failing check bits points exactly to the bit where the error occurred! For example, if the checks at positions 1 and 4 fail, then 1 + 4 = 5, so the bit at position 5 has been flipped. The receiver simply inverts that bit to complete the correction.
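
A minimal sketch of this scheme for the common Hamming(7,4) layout (4 data bits, check bits at positions 1, 2, and 4):

```python
def hamming_encode(data4):
    """Place the 4 data bits at positions 3, 5, 6, 7 and compute the
    check bits at positions 1, 2, 4 (positions are 1-indexed)."""
    code = [0] * 8                          # index 0 unused
    code[3], code[5], code[6], code[7] = data4
    for p in (1, 2, 4):
        for i in range(1, 8):
            if i != p and (i & p):          # p covers positions with bit p set
                code[p] ^= code[i]
    return code[1:]

def hamming_correct(code7):
    """Recompute the checks; the failing positions sum to the error position."""
    code = [0] + list(code7)
    syndrome = 0
    for p in (1, 2, 4):
        parity = 0
        for i in range(1, 8):
            if i & p:
                parity ^= code[i]
        if parity:
            syndrome += p
    if syndrome:
        code[syndrome] ^= 1                 # flip the erroneous bit back
    return code[1:]

word = hamming_encode([1, 0, 1, 1])
corrupted = word[:]
corrupted[4] ^= 1                           # flip position 5 (0-indexed 4)
print(hamming_correct(corrupted) == word)   # True
```

Flipping any single bit of the codeword produces a nonzero syndrome equal to that bit's position, so the receiver can always undo one flip without retransmission.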

Clock Recovery: Finding the Beat in a Data Stream

We previously discussed using elastic buffers to handle small frequency variations in the clock. But there's a more fundamental problem: How does the receiver know when to sample the signal to read a bit? This is  the problem that clock recovery  solves.

In modern synchronous communications, such as Ethernet, clock information is cleverly encoded within the data signal itself.  A clock recovery unit (CRU) at the receiving end  continuously observes the received signal and uses signal transitions (e.g., from high to low) to lock onto the sender's clock.

To ensure that there are enough transitions in the signal, the link layer uses a specific  line coding  scheme. For example:

  • Manchester encoding: used in early 10 Mbps Ethernet, it encodes a 1 as a high-to-low transition and a 0 as a low-to-high transition. This guarantees a transition within every bit period, which aids clock recovery, but at the cost of doubling the required bandwidth.
  • 4B/5B encoding: maps each 4-bit data block to a 5-bit codeword. The 5-bit codewords are carefully chosen to avoid long runs of consecutive 0s or 1s, ensuring sufficient transitions in the signal. At 25% overhead, it is far more efficient than Manchester encoding.
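
Manchester encoding is simple enough to sketch directly, following the mapping described above (1 → high-to-low, 0 → low-to-high; note that the opposite convention also appears in the literature):

```python
def manchester_encode(bits):
    """Each bit becomes two half-bit signal levels, guaranteeing a
    mid-bit transition for clock recovery (at twice the signaling rate)."""
    return [half for b in bits for half in (("H", "L") if b else ("L", "H"))]

print(manchester_encode([1, 0, 1, 1]))
# ['H', 'L', 'L', 'H', 'H', 'L', 'H', 'L'] -- a transition in every bit period
```

Even a long run of identical bits still produces a transition every bit period, which is exactly what the clock recovery unit needs.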

Media Access Control: Who has the right to speak?

In many networks, multiple devices share the same communication medium (e.g., coaxial cable, air).  The Media Access Control (MAC)  protocol is a set of rules that addresses the question of "who can send data and when," with the goal of avoiding or resolving data conflicts.

The Rules of the Wired World: Ethernet and CSMA/CD

In the early days of Ethernet, all computers were connected to a shared bus. To coordinate communications, Ethernet used  the Carrier Sense Multiple Access with Collision Detection (CSMA/CD)  protocol.

Its workflow is like a civilized round table meeting:

  1. Carrier Sense  : Listen before speaking. Before sending data, the device will first listen to see if the channel is idle.
  2. Multiple Access  : If the channel is idle, start sending data.
  3. Collision Detection  : Talk and Listen. A device continuously monitors the channel while sending data. If the signal it hears is inconsistent with the signal it sent, it means a collision has occurred (two devices are "speaking" at the same time).
  4. Backoff and Retry: once a collision is detected, the station immediately stops sending and broadcasts a jam signal. It then waits a random amount of time (whose upper bound grows exponentially with the number of collisions, a scheme known as binary exponential backoff) before starting again from the first step.
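
Step 4 can be sketched as follows; the cap of 10 doublings follows classic Ethernet practice, but treat the exact parameters as an assumption of this sketch:

```python
import random

def backoff_slots(collision_count: int) -> int:
    """After the n-th consecutive collision, wait a uniformly random
    number of slot times in [0, 2**min(n, 10) - 1]. The cap at 10
    doublings follows classic Ethernet (an assumption of this sketch)."""
    return random.randrange(2 ** min(collision_count, 10))

# The expected wait doubles after each collision, quickly spreading
# competing senders apart in time.
print(backoff_slots(1), backoff_slots(3), backoff_slots(16))
```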

It's worth noting that, with the advancement of technology, modern Ethernet networks are almost entirely built using  switches  . Switches provide independent collision domains for each port and support  full-duplex  communication (simultaneous transmission and reception). This eliminates collisions and renders the CSMA/CD protocol obsolete in modern wired networks.

How do modern switches achieve full-duplex?

Switches transform multiple devices from a shared bus into many independent, point-to-point links. Point-to-point links can transmit and receive simultaneously (full-duplex), so there is no conflict problem of "two endpoints interfering with each other on the same shared medium at the same time" - the CSMA/CD scenario is eliminated by the physical structure.

The specific mechanism (several key points) is as follows.

  1. Independent point-to-point links
  • Early bus-type Ethernet (coaxial) was a shared medium, so two hosts transmitting at the same time collided.
  • A switch forwards per port: each host-to-switch connection is a separate one-to-one link, and the switch maintains separate send/receive channels at each end, so A→switch and B→switch do not interfere with each other.
  2. Physically separated transmit and receive paths
  • In many media, transmit and receive use different wire pairs or different optical fibers. Optical links usually use one fiber for transmit and one for receive (Gigabit/10G optical modules have separate Tx/Rx ports), allowing simultaneous transmission and reception.
  • Twisted-pair standards such as 100BASE-TX use separate wire pairs for TX and RX, allowing simultaneous bidirectional transmission.
  3. Full duplex even on the same wire pairs: echo cancellation + DSP
  • 1000BASE-T (Gigabit Ethernet) uses four wire pairs, each carrying transmit and receive signals simultaneously. To achieve this, the locally transmitted signal must be "subtracted" from the mixed received signal (echo cancellation), followed by digital signal processing to separate the remote signal. Modern PHY chips use a hybrid transformer combined with DSP to accomplish this, enabling collision-free bidirectional transmission on the same pair.
  4. Switch forwarding and buffering
  • When a frame arrives, the switch learns the MAC address table and forwards the frame to its destination port, typically with high-speed buffers to absorb short bursts and avoid frame loss. Because each link is point-to-point, frames are never disrupted by collisions with third hosts along the link.
  5. Link autonegotiation
  • The two ends of an Ethernet link determine whether to use full duplex through autonegotiation. If both ends support full duplex and negotiation succeeds, the CSMA/CD logic is disabled and full-duplex communication mode is enabled.

Why CSMA/CD is "dead in name only" in switched networks

  • CSMA/CD was designed for collision detection on shared media, but on a point-to-point full-duplex link no third party transmits concurrently with either end on the same physical channel, so there are no collisions to detect. With no collisions, the collision-detection and backoff mechanisms are pointless and are simply disabled. (For compatibility, the standard retains the historical definition but does not enable it in full-duplex mode.)
  • Although most modern networks are switched and full-duplex, the Ethernet standard retains historical constraints such as minimum frame length (primarily for backward compatibility with older equipment and specification consistency). In mature full-duplex networks, the practical requirements of minimum frame length and CSMA/CD are no longer important, but the standard fields still exist.

In-depth understanding of Ethernet: frame structure, minimum frame length and physical specifications

Now that we understand how CSMA/CD works, let's delve into the technical details of Ethernet and see how its data frames are constructed and why it has special requirements on the frame length.

Ethernet frame structure

A standard Ethernet Type II frame (currently the most commonly used type) consists of the following parts:

+--------------------+----------------+-------------+--------------------+-----------+
| Destination MAC    | Source MAC     | Type        | Data (IP packet    |  CRC      |
| (6 bytes)          | (6 bytes)      | (2 bytes)   | etc., 46-1500 B)   | (4 bytes) |
+--------------------+----------------+-------------+--------------------+-----------+
<----------------------- minimum 64 bytes, maximum 1518 bytes ----------------------->

When the frame is transmitted on the physical layer, an 8-byte preamble and start frame delimiter (SFD) are prepended for clock synchronization and to mark the beginning of the frame. These are not counted in the frame length.

  • Destination/Source MAC address  : A globally unique 48-bit physical address.
  • Type  : Indicates the upper layer protocol, such as  0x0800 IPv4.
  • Data  : Carries data packets from the network layer. If the data is less than 46 bytes, the link layer will automatically pad it to 46 bytes.
  • CRC check: a 4-byte cyclic redundancy check code used to detect whether the frame was corrupted in transit.

The mystery of minimum frame length

Why does the data field of an Ethernet frame have to be at least 46 bytes, making the entire frame length (excluding the preamble) at least 64 bytes? This is closely related to the collision detection mechanism of CSMA/CD.

In a shared-media network, the worst-case scenario is that A has just sent the last bit of a frame, and the first bit of that frame has just arrived at B, at the farthest end of the network. At that very moment, B also begins transmitting, causing a collision. The collision signal must be transmitted back from B to A. A must detect this collision before it completes its frame, otherwise it will mistakenly believe the transmission was successful.

Therefore, the following inequality must be satisfied: the time it takes to send a frame must be at least the time it takes for a signal to make a round trip across the network:

L_min / R ≥ 2d / v

where:

  • L_min is the minimum frame length (bits).
  • R is the data rate (bps).
  • d is the maximum span of the network (m).
  • v is the signal propagation speed (m/s).

The original 10Mbps Ethernet standard specified a maximum span of 2500 meters, which resulted in a minimum frame length of 512 bits (64 bytes). This standard was inherited by subsequent Fast Ethernet and Gigabit Ethernet, even though CSMA/CD was no longer necessary in switched networks.
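
Plugging the original figures into the inequality gives the raw bound; the standard's 512 bits is larger because the slot-time budget also covers repeater and other processing delays along the path:

```python
def min_frame_bits(rate_bps: float, span_m: float, speed_mps: float = 2e8) -> float:
    """L_min = R * 2d / v: the frame must outlast one round trip."""
    return rate_bps * 2 * span_m / speed_mps

print(min_frame_bits(10e6, 2500))  # 250.0 -- the standard rounds up to 512 bits
```

The formula also shows why 100 Mbps Ethernet had to shrink the maximum span: at ten times the rate, the same 64-byte minimum frame only covers one tenth of the round-trip distance.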

Evolution of Ethernet Physical Layer Specifications

The naming convention for Ethernet standards is usually rate + signaling method + medium/distance.

  • Fast Ethernet (100 Mbps): the most common standard is 100BASE-TX, which uses two pairs of Category 5 unshielded twisted pair (UTP) with a transmission distance of 100 meters.
  • Gigabit Ethernet (1 Gbps): the 1000BASE-T standard dominates the desktop market, using all four pairs of Category 5 UTP to achieve full-duplex gigabit transmission over 100 meters. Fiber options include 1000BASE-SX (multimode fiber, short distances) and 1000BASE-LX (single-mode/multimode fiber, longer distances).
  • 10 Gigabit Ethernet (10 Gbps) and beyond: 10GBASE-T allows 10G transmission over Category 6A twisted pair. In data centers and backbone networks, fiber dominates: standards such as 10GBASE-SR/LR/ER correspond to different fiber types and transmission distances (from a few hundred meters to tens of kilometers).

Challenges in the Wireless World: CSMA/CA and Hidden Terminals

The situation is much more complex in wireless networks. Because signals attenuate as they propagate through the air, the sender cannot effectively "hear" what is happening at the receiver as they do in wired networks. Therefore, collision detection becomes impractical.

Wi-Fi (IEEE 802.11) uses another strategy:  Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA)  . Its core concept is to "try to avoid collisions, and acknowledge if that fails."

  1. Carrier sense  : Also sense the channel before sending.
  2. Collision avoidance  : Even if the channel is idle, a random backoff time must be waited before sending, because there may be other devices preparing to send.
  3. Link Layer ACK  : Because collision detection is impossible, the sender relies on a short acknowledgment frame (ACK) from the receiver to determine whether the data has been successfully delivered. If the ACK is not received within the specified time, the data is considered lost (possibly due to collision or noise), and exponential backoff and retransmission are performed.

However, CSMA/CA also faces challenges unique to wireless environments:

  • Hidden Terminal Problem  : Terminals A and C can both communicate with access point B, but A and C cannot hear each other. If A and C send data to B at the same time, a collision will occur at B, without A and C knowing about it.
A -----> B <----- C
   (A and C cannot hear each other)
  • Exposed Terminal Problem  : Terminal B is sending data to A. Terminal C hears B's transmission. Although C's intention to send data to D will not interfere with the communication between B and A, according to the CSMA rules, C will remain silent, resulting in a waste of channel resources.

To alleviate the hidden terminal problem, Wi-Fi introduces an optional  Request-to-Send/Clear-to-Send (RTS/CTS)  mechanism. The sender can first send a short RTS frame, to which the receiver responds with a CTS frame. All devices (including hidden terminals) that hear the CTS frame remain silent for a specified period of time, clearing the field for subsequent data transmission. However, this mechanism itself incurs additional overhead.

Crossing Borders: IP Fragmentation

Different link layer technologies may have different MTUs. For example, Ethernet's MTU is typically 1500 bytes. When a large IP datagram needs to be forwarded from a link with a larger MTU (such as MTU = 9000) to a link with a smaller MTU (such as MTU = 1500), the router needs to  fragment the datagram  .

There are special fields in the IP header to handle fragmentation and reassembly:

  • Identification: a unique identifier; all fragments of the same original datagram share the same Identification value.
  • Flags: contains the More Fragments (MF) bit; every fragment except the last has MF set to 1.
  • Fragment Offset: indicates where the current fragment's data sits in the original datagram (in units of 8 bytes).

A key design is that  fragments are only reassembled at the final destination host  . Intermediate routers are not responsible for reassembly, they only forward.

However, IP fragmentation is an operation that should be avoided as much as possible because it is very fragile - the loss of any fragment will result in the loss of the entire original data packet, and the upper-layer protocol (such as TCP) must retransmit the entire large data packet, which greatly affects performance.

Modern networks often use Path MTU Discovery (PMTUD) to avoid fragmentation. When a TCP connection is established, the host sends probe packets with the Don't Fragment (DF) bit set and uses the error messages returned by routers along the way to discover the minimum MTU on the entire path, then sizes its packets accordingly.

IP fragmentation example analysis

To understand IP fragmentation more clearly, let's look at a specific calculation example.

Scenario  : A host wants to send an IP datagram with a total length of 3820 bytes (20 bytes for the header and 3800 bytes for the data). The datagram needs to pass through a link with an MTU of 1500 bytes.

Fragmentation process

  1. Determine the maximum payload per fragment: the MTU is 1500 bytes and the IP header takes 20, so each fragment can carry at most 1500 - 20 = 1480 bytes of data. To keep Fragment Offset easy to compute, payload lengths are usually rounded to multiples of 8; 1480 happens to be one (1480 / 8 = 185).
  2. Fragment 1
  • Data: carries the first 1480 bytes of the original data.
  • Total length: 1480 (data) + 20 (header) = 1500 bytes.
  • MF (More Fragments) flag: set to 1 (more fragments follow).
  • Fragment Offset: 0 / 8 = 0 (this is the first fragment).
  3. Fragment 2
  • Data: carries bytes 1481 to 2960 of the original data (1480 bytes in total).
  • Total length: 1480 (data) + 20 (header) = 1500 bytes.
  • MF flag: set to 1 (more fragments follow).
  • Fragment Offset: 1480 / 8 = 185 (the offset is the preceding data length divided by 8).
  4. Fragment 3
  • Data: carries the remaining data. Of the 3800 original bytes, 1480 + 1480 = 2960 have been sent, leaving 3800 - 2960 = 840 bytes.
  • Total length: 840 (data) + 20 (header) = 860 bytes.
  • MF flag: set to 0 (this is the last fragment).
  • Fragment Offset: 2960 / 8 = 370.

Ultimately, the 3820-byte IP datagram is divided into three separate small IP packets and transmitted across the network until they are reassembled at the final destination host.
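
The whole computation can be reproduced with a short sketch (the function and its return format are hypothetical, for illustration only):

```python
def fragment(data_len: int, mtu: int, header: int = 20):
    """Return a list of (data_bytes, MF_flag, offset_in_8_byte_units)
    for an IP payload of data_len bytes crossing a link with this MTU."""
    per_frag = (mtu - header) // 8 * 8       # payload rounded down to 8-byte units
    frags, offset = [], 0
    while offset < data_len:
        chunk = min(per_frag, data_len - offset)
        more = 1 if offset + chunk < data_len else 0
        frags.append((chunk, more, offset // 8))
        offset += chunk
    return frags

print(fragment(3800, 1500))
# [(1480, 1, 0), (1480, 1, 185), (840, 0, 370)]
```

The output matches the three fragments worked out above: two full 1500-byte packets and a final 860-byte packet with MF = 0.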

Conclusion: A solid foundation

From elastic buffers that handle microsecond clock discrepancies to modulation and coding guided by Shannon's law; from MAC protocols that avoid data collisions to IP fragmentation across different links, the physical and link layers together form the solid foundation of network communications. They abstract signals from the physical world, full of noise and uncertainty, into seemingly reliable point-to-point links that upper-layer protocols can trust.

It's these underlying, ingenious designs that support the vast and complex internet we live in today. I hope this blog post helps you better understand the wisdom and charm behind it all.