Remote data recovery over the Internet is a convenient, cost-effective and efficient way to provide data recovery services to off-site clients. As an alternative to shipping disks by post, remote data recovery has far fewer logistical obstacles and security risks. Mishandling or extreme temperatures during shipping can further damage the disk, drastically reducing the chances of recovery. Furthermore, a disk in the mail can be lost, or worse, stolen by identity thieves, corporate competitors or foreign spies.
Remote data recovery over the Internet removes the need to entrust your data to a shipping service or mailroom. The wide availability and affordability of fast broadband Internet access also make remote data recovery over a network feasible for more customers in more corners of the globe.
R-Studio is our professional data recovery program that provides a full range of remote data recovery tools. These tools allow you to service your data recovery customers over the Internet as if you were actually sitting in front of their workstation. We discuss remote data recovery in greater detail in our article R-Studio: Data Recovery over Network.
While R-Studio offers a number of conveniences and process improvements, there are a few potential barriers to remote data recovery that do not arise when physically shipping disks. Chief among these is providing a secure connection between an R-Studio workstation and the customer’s computer. Successfully connecting the host and client machines is further complicated by corporate networks, firewalls and other in-house network infrastructure.
To understand the challenges of creating a secure Internet connection suitable for data recovery, it’s best to begin with a bit of general information about Internet networking. Each device that is directly reachable on the Internet, be it a server, computer, router, printer or smartphone, has its own unique Internet Protocol (IP) address. This is like a device’s telephone number on the Internet; if another machine wants to contact it, it “calls” it via the known IP address. For example, 18.104.22.168 is the IP address of one of the Google servers. Domain names, such as www.google.com, are like “speed dial” for these IP addresses. When you type www.google.com into your browser, what you’re actually doing is querying a Domain Name System (DNS) server, which looks up the IP address associated with www.google.com. Your browser then uses that IP address to connect to the appropriate server on the Internet.
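The name-to-address lookup described above can be sketched in a few lines of Python. This is only an illustration using the standard library's system resolver; the function name resolve_ipv4 is our own, and the addresses returned for any given domain will vary by location and over time.

```python
import socket


def resolve_ipv4(hostname: str) -> list:
    """Return the distinct IPv4 addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # sockaddr is (ip, port) for IPv4
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Calling resolve_ipv4("www.google.com") on a machine with Internet access returns whichever addresses the DNS server currently hands out for that name. Passing a numeric address such as "127.0.0.1" simply returns it unchanged, since no lookup is needed.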
The issue with connecting to a data recovery client’s computer via an IP address is that most computers are connected to the Internet via a corporate network. That is, they access the Internet by first passing through an intermediary, such as a corporate firewall or a corporate network address translation (NAT) device. Such network infrastructure allows companies to exercise greater control over the flow of Internet data both in and out of the company. In general, there are two types of networks: private networks and public networks. The key difference between these two types of networks is how computers on these network types are made visible to computers on the Internet.
On a public network, each computer has its own unique IP address that is visible to the entire Internet. The firewall may serve to restrict certain types of traffic or traffic from unauthorized users, but in general, these machines are readily and transparently accessible to anyone connected to the Internet. This is similar to having a direct phone line at your company that rings your desk phone without going through the operator or administrative assistant. That phone number is unique to your phone and calls to that number are not screened as thoroughly (i.e. anyone with the number can call you). It is the same with the IP address of a computer on a public network. For this reason, connecting to one of these machines for data recovery is typically very easy. Unless the firewall is specifically set up to block traffic from the data recovery technician’s machine, a connection can be made using the public IP address of the computer. Accessing a computer behind a firewall in such a manner is called firewall traversal.
The problem is that the vast majority of companies, organizations and homes do not use a public network. This is primarily for security purposes, but for many users, it’s also for economic reasons: public IP addresses cost money, and a private network needs only one. Unless you are your own Internet service provider (ISP), you are most likely on a private network.
If a public network is like giving your computer a direct phone line to the outside world, a private network is like giving your phone a company extension specific to the building. In order to reach you, outside callers must first go through the operator, who will route the call accordingly, or punch in your extension after dialing the main phone number for the entire company.
The concept is similar for a private network, except computers outside the private network won’t know your computer’s private IP address, nor will this private IP address be useful. Instead of each computer having its own public IP address that is unique on the Internet, each computer on a private network has an IP address that is unique within the network, for example 192.168.1.192, 172.16.2.23, or 10.10.10.10.
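Whether an address belongs to one of the reserved private ranges can be checked with Python's standard ipaddress module. The sketch below is illustrative: the helper name describe is our own, and 8.8.8.8 (a well-known public DNS resolver) is added only as a contrasting example. Note that is_private also reports True for other non-routable ranges such as loopback addresses.

```python
import ipaddress


def describe(address: str) -> str:
    """Label an IP address as 'private' (not globally routable) or 'public'."""
    ip = ipaddress.ip_address(address)
    return "private" if ip.is_private else "public"


for addr in ("192.168.1.192", "172.16.2.23", "10.10.10.10", "8.8.8.8"):
    print(addr, describe(addr))
```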
The only machine that has a unique public IP address is the NAT device. In order to reach one of the machines on a private network, a computer on the Internet must first communicate with the NAT device. The NAT device is responsible for routing the data to the appropriate machine on the private network. This is known as NAT traversal. Although each computer on a private network will be accessed using the IP address of the NAT device, each computer will have its own port through which it receives Internet data. This is similar to a phone extension. Again, computers on the Internet will be communicating via the NAT device’s IP address and a port number, but will never know the other computer’s private IP address.
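The translation a NAT device performs can be modeled with a small table. The following is a deliberately simplified sketch, not how any particular router is implemented: the class ToyNat, its method names, the starting port 40000 and the documentation address 203.0.113.5 are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip address, port)


class ToyNat:
    """A toy NAT: one public IP, one public port per private endpoint."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self._next_port = first_port
        self._by_private: Dict[Endpoint, int] = {}
        self._by_port: Dict[int, Endpoint] = {}

    def outbound(self, private: Endpoint) -> Endpoint:
        """A private host sends traffic out: allocate or reuse a public port."""
        if private not in self._by_private:
            port = self._next_port
            self._next_port += 1
            self._by_private[private] = port
            self._by_port[port] = private
        # The Internet only ever sees the NAT's address and the mapped port.
        return (self.public_ip, self._by_private[private])

    def inbound(self, public_port: int) -> Optional[Endpoint]:
        """Traffic arrives from the Internet: deliver only if a mapping exists."""
        return self._by_port.get(public_port)


nat = ToyNat("203.0.113.5")
print(nat.inbound(40000))                      # no mapping yet, so nothing to deliver to
print(nat.outbound(("192.168.1.192", 51000)))  # the private host talks first
print(nat.inbound(40000))                      # now replies can be routed back
```

The sketch also shows why outbound access is the easy direction: a mapping exists only after the private machine has initiated traffic.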
Similar to using your desk phone to make an outside call, accessing the Internet from a computer on a private network is significantly easier than reaching a computer on a private network from the Internet. But it can be done.
There are many advantages to a private network. As mentioned above, it requires only one public IP address for an entire network of devices. But more importantly, a private network adds an additional layer of computer security, since the NAT device acts as a gatekeeper for all Internet traffic. This helps block unsolicited inbound traffic, such as hacking attempts and network-borne malware. Recall from above that with a public network, anyone can access a computer unless specifically blocked. With a private network, the opposite is true: no one can access a computer on a private network unless they’ve been specifically given access. While this is effective at deterring malicious attacks and unwanted traffic, it makes it equally difficult to establish a connection for legitimate purposes, such as data recovery over the Internet. Without knowing the port that the NAT device has assigned to the computer on the private network, the data recovery technician’s machine won’t know how to reach the customer’s computer.
As mentioned above, access to the Internet from a computer on a private network is usually a one-way street. But you can give a computer on a private network the equivalent of a “direct line” through port forwarding. Port forwarding is when a NAT device permanently assigns a port to a specific machine on the network. For example, if the NAT device with an IP address of 22.214.171.124 forwarded port 2083 to a computer, the address that another computer would use to reach it via the Internet would be 22.214.171.124:2083. This IP address/port combination can be used much like a public IP address in order to establish a reliable connection to a machine on a private network. Port forwarding is the most common way to achieve NAT traversal.
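A port-forwarding rule is essentially a fixed entry in the NAT device's lookup table. The sketch below illustrates the idea using the example address and port from the text; the private address 192.168.1.192, the FORWARDS table and both function names are hypothetical, and a real router is configured through its own administration interface rather than code like this.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip address, port)


def parse_endpoint(text: str) -> Endpoint:
    """Split an 'ip:port' string into its address and numeric port."""
    host, _, port = text.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'ip:port', got {text!r}")
    return host, int(port)


# Hypothetical static rule: public port 2083 always goes to one private machine.
FORWARDS: Dict[int, Endpoint] = {2083: ("192.168.1.192", 2083)}


def route_inbound(public_port: int, forwards: Dict[int, Endpoint]) -> Optional[Endpoint]:
    """Return the private endpoint for a forwarded port, or None if not forwarded."""
    return forwards.get(public_port)


ip, port = parse_endpoint("22.214.171.124:2083")
print(route_inbound(port, FORWARDS))  # forwarded to the private machine
print(route_inbound(8080, FORWARDS))  # not forwarded, so the traffic is dropped
```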
NAT Traversal and Data Recovery over the Internet
While port forwarding is the typical solution for NAT traversal, most customers who are on a private network will not have the necessary network privileges or technical know-how to forward a port. This means that allowing an incoming connection from the data recovery technician may not be feasible.
Fortunately, R-Studio lets you perform file recovery operations over the Internet using a connection initiated by the customer. This is similar to the way a computer would access a website on the Internet from behind a private network. The only additional step is to set up port forwarding on the NAT device that is connected to the data recovery technician’s computer. In this way, all of the technically challenging aspects of data recovery over the Internet are left up to the technician, while the process remains user-friendly for the customer.
Below is an example of how this connection may be set up.
Here, the customer is running R-Studio Agent, while the technician is running R-Studio. Both are on private networks. The R-Studio technician forwards a port to allow access from the R-Studio Agent program on the customer’s computer. This allows the R-Studio technician to “accept” the incoming connection from the customer. Meanwhile, the customer does not have to open any inbound access to the corporate network they are on, so its security is not compromised. This is a good trade-off between corporate network security and ease of use for the customer. Likewise, since the R-Studio technician has only opened that specific port for use by R-Studio Agent, their network remains secure as well.
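The general pattern of a customer-initiated connection can be demonstrated with plain TCP sockets. The sketch below is not R-Studio's actual protocol: the LIST_DISKS command, the reply format and all function names are invented for illustration, and both ends run on the local machine so the example is self-contained. The point it shows is that the side that listens (the technician) can still drive the session once the other side (the agent) has dialed out.

```python
import socket
import threading


def technician_listen(server: socket.socket, results: list) -> None:
    """Technician side: accept one incoming agent connection, then drive it."""
    conn, _addr = server.accept()
    with conn:
        conn.settimeout(5)
        conn.sendall(b"LIST_DISKS\n")  # hypothetical command, not R-Studio's
        results.append(conn.recv(1024).decode().strip())


def agent_connect(host: str, port: int) -> None:
    """Customer side: dial out to the technician, then answer one request."""
    with socket.create_connection((host, port), timeout=5) as sock:
        request = sock.recv(1024).decode().strip()
        sock.sendall(f"OK {request}\n".encode())


def demo() -> str:
    """Run both roles locally and return the reply the technician received."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    results = []
    listener = threading.Thread(target=technician_listen, args=(server, results))
    listener.start()
    agent_connect("127.0.0.1", port)  # the outbound call comes from the agent
    listener.join(timeout=5)
    server.close()
    return results[0]
```

In a real deployment the technician's listening port would be the one forwarded on the technician's NAT device, and the agent would connect to that NAT device's public address.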
As illustrated above, the first challenge to successfully performing data recovery over the Internet is securing a reliable connection. Private networks complicate this matter, but port forwarding and R-Studio’s ability to establish connections initiated by the client make this process easy. With this obstacle removed, data recovery over the Internet stands as the most cost-effective, efficient and secure alternative to physically shipping disks or deploying technicians into the field.
For a more in-depth illustration of how the above connection can be established, read our article R-Studio Technician: Data Recovery over the Internet, which presents a field test of such a network layout for remote data recovery over the Internet.