All about File Transfers

In this issue, I am going to talk about File Transfer Protocols. This particular article is going to be a wee bit technical, but I'll try to make it as human as possible.

What is a File Transfer Protocol ?

Many times you will need to send or receive information that cannot be handled as a text based message. Examples are data files, programs or any other kind of file that is not text based.

Unlike text messages, binary files cannot tolerate even the slightest error or anomaly. To explain this, let us take the example of a simple message :

Deer Raju,

Hi! hou arre yoo ? Deed yu geth de letyer Aih sended yastrdey ?

Rply suuuun,

Atul.

In the example shown above, it is fairly obvious that the writer (in this case me, and this is an example - I was NOT sleeping during grammar and spelling classes at school) has made some gross errors in both spelling and grammar. Yet the message, once received by the addressee, will carry its meaning across, in spite of all the errors. This is because the interpretation of the message is left to the human brain, which is much more accommodating (or fault tolerant) than a computer can ever be.

But take the example of a binary program that, let's say, reads and displays sectors of a disk. It uses the DOS interrupt 25H to read the sector, and then displays it to you. By itself, this represents a relatively innocent and safe utility program, but if it is even slightly altered, it can be a potential monster. If this program, sent over the telephone line using modems, gets "hit" by line noise or interference (as described in the first of these articles), then just two flipped bits can turn the Interrupt 25H instruction into Interrupt 26H. This is fatal to the user of the program, because Interrupt 26H does not READ from a disk sector, but WRITES to it, and the innocent program you sent can potentially wipe out the addressee's hard disk !
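To see how little damage it takes, here is a small Python sketch that compares the two interrupt numbers bit by bit. The helper name flipped_bits is my own invention for illustration, and the sketch only compares two numbers - it does not touch any disk.

```python
INT_READ = 0x25   # the "read sector" interrupt number from the example above
INT_WRITE = 0x26  # the "write sector" interrupt number from the example above


def flipped_bits(sent: int, received: int) -> list:
    """Return the bit positions (0 = least significant) where two bytes differ."""
    diff = sent ^ received  # XOR leaves a 1 wherever the bytes disagree
    return [bit for bit in range(8) if diff & (1 << bit)]


print(f"{INT_READ:08b} -> {INT_WRITE:08b}")   # prints "00100101 -> 00100110"
print(flipped_bits(INT_READ, INT_WRITE))      # prints "[0, 1]"
```

Only the two lowest bits differ, which is well within the reach of a single burst of line noise.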

Or take the example of a database file that contains salary information. A single hit, and Mr. ABC's salary for the next month suddenly jumps a hundredfold, causing losses to the company ! This is not too uncommon, and you yourself may have experienced it if you have ever sent a fax or telex message that got garbled.

To prevent this sort of thing from happening, we use a system called "Protocol Driven File Transfers" involving "Error Detecting and Correcting Methods". In plain English this means that a File Transfer Protocol lets you transfer files with very little chance of an error creeping into the file unnoticed. If one does creep in, it is detected, and corrected by sending the damaged portion again.

A file transfer protocol is an agreed upon method or set of rules (between the sender and receiver) for transferring file information. These rules, unique to each protocol, form the basis of the comparative strengths and weaknesses of differing communications protocols.

There are literally hundreds of file transfer protocols used today, but the most common ones are XMODEM, XMODEM-1K, ZMODEM, YMODEM BATCH, YMODEM-G BATCH and KERMIT.

Most good communication packages support all the above mentioned protocols, both for sending as well as receiving files.

Selecting a Protocol

Any file transfer begins with your selection of the File Transfer Protocol (FTP) that you wish to use. Naturally you must choose an FTP that is supported by both sides of the connection.

Which transfer protocol should you use ? There is no definite answer to this question, but there ARE some guidelines which you can follow :

1. First of all, check which of the protocols is supported by your communication program. You can determine the FTPs available in your communication program by going through its manual. If you are using ProComm, hit to see a menu of available protocols.

2. What is the line condition like ? If you are NOT using an error correcting modem, and frequently see junk on the screen, then it is advisable that you use either of the "short packet" FTPs, i.e. XMODEM or KERMIT. Though the throughput of these FTPs is less (slower) than that of XMODEM-1K, YMODEM BATCH or ZMODEM, they provide faster error recovery.

3. Are you calling using a 7 bit line ? Normally you shouldn't, because most Communication Hosts specify that you should call using 8 bits no parity, but in some cases using a 7 bit setting may be unavoidable, such as when you are calling from a mini computer terminal or your communication software does not permit anything else. If you ARE calling at 7 bits, then you have only one choice - KERMIT, which is the only FTP that works on a 7 bit connection.

4. Are you calling using an error correcting modem ? If yes, then the high-throughput FTPs XMODEM-1K, YMODEM BATCH, YMODEM-G BATCH or ZMODEM should be used. Since the chances of an error occurring are remote, these FTPs allow you to push through larger data packets, speeding up the process immensely.
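The four guidelines above can be boiled down to a few lines of Python. This is only a sketch of my reading of them: the function name suggest_protocols and its parameters are made up for illustration, and the last case (a clean line without an error correcting modem) is my own assumption, since the guidelines do not cover it explicitly.

```python
def suggest_protocols(supported, seven_bit_line, error_correcting_modem, noisy_line):
    """Return candidate protocols, in the order the guidelines list them."""
    if seven_bit_line:
        # Guideline 3: on a 7 bit line, KERMIT is the only choice.
        candidates = ["KERMIT"]
    elif error_correcting_modem:
        # Guideline 4: errors are unlikely, so use the high-throughput FTPs.
        candidates = ["XMODEM-1K", "YMODEM BATCH", "YMODEM-G BATCH", "ZMODEM"]
    elif noisy_line:
        # Guideline 2: short packets recover from errors faster.
        candidates = ["XMODEM", "KERMIT"]
    else:
        # Assumption: on a clean line the larger-packet FTPs are worth trying.
        candidates = ["XMODEM-1K", "YMODEM BATCH", "ZMODEM"]
    # Guideline 1: keep only what the communication program actually offers.
    return [name for name in candidates if name in supported]


mine = {"XMODEM", "XMODEM-1K", "KERMIT", "ZMODEM"}
print(suggest_protocols(mine, seven_bit_line=False,
                        error_correcting_modem=False, noisy_line=True))
```

Whatever the function suggests, the remote host must offer the same protocol too.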

OK, now that you have decided which protocol you wish to use, indicate your choice to the remote host (who usually shows you ITS available protocols in the form of a list or a menu).

What happens next is dependent on the choice of FTP, the operation (upload/send or download/receive) and the procedure from where the file transfer was initiated.

Transferring a File

If the transfer involves a download (receiving a file), then the remote host will almost always ask you for the name of the file you wish to receive (unless there isn't any choice, since there is only one file).

If the file requested is found, then the remote host will ask you to initiate your receive procedure using the specified protocol, and will await your start signal. At this point, you can usually abort the transfer by pressing a couple of times.

To initiate the transfer at your end, you have to instruct your communication package to begin downloading the file with the matching protocol. If you are using ProComm, then hit and select the same protocol as you have instructed the remote host to use. (If you are not using ProComm, you can determine the exact procedure from your communication software manual.) In the case of either of the XMODEM FTPs, you will have to tell your program what file name to use at your end, while the YMODEM BATCH, YMODEM-G BATCH, KERMIT and ZMODEM protocols do that automatically for you.

Now you will usually see a status screen on your terminal, telling you about the progress of the transfer. The actual transfer should begin within 1-10 seconds. If it does not, then something is wrong, and you will have to cancel the transfer.

Once the transfer begins, you will see the number of blocks of data that your computer has received, and the error status. Don't be worried if you see errors being reported. This is actually a positive sign - it means that the FTP has detected an error, and is correcting it ! Of course the non-appearance of errors is a good sign, too.

After the transfer completes, your communication program will tell you so.

That's it ! The file has been successfully transferred ! If it has not (because of excessive or fatal errors, or because you chose to abort the transfer), you will usually be given an option to retry. Answering "YES" will restart the process from the point of the FTP selection; answering "NO" aborts the process and returns you to the calling procedure.

More about File Transfer Protocols

Although each file transfer protocol differs in the specific rules that it follows in transferring information, all FTPs have certain similarities:

* Each Communication session begins with an initialization state where the receiver and sender establish the specific method of information transfer. (Techno-jargon that means that both sides agree on which "language" they are going to speak in)

* The contents of a file are transmitted in the form of packets or frames of an agreed upon format.

* All of the protocols are "stop and wait" protocols where after sending a packet or frame, the sender stops and waits for a response to the sent packet. Ymodem-G Batch and Zmodem are exceptions to this - they blast data across "full throttle" unless specifically told to pause by the protocol at the other end.

* Each packet or frame has a Start-of-Header character or sequence which indicates to the receiver the beginning of the packet or frame.

* Each packet or frame is uniquely identified by a frame number or sequence number.

* The integrity of the frame or packet is assured through the use of an error detection code typically placed at the end of the packet or frame.

* The packet or frame contains a data portion which is the file information being transferred.
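The similarities listed above can be seen working together in a toy "stop and wait" transfer. This Python sketch is not any real protocol: the packet layout, the one byte checksum and the names make_packet, packet_ok and send_stop_and_wait are all assumptions made for illustration, and the "line noise" is simulated by flipping one bit on the first attempt of chosen packets.

```python
SOH = 0x01  # start-of-header marker used by this toy format


def make_packet(seq: int, data: bytes) -> bytes:
    """Header (SOH + sequence number), data portion, then a 1 byte check code."""
    check = sum(data) % 256
    return bytes([SOH, seq % 256]) + data + bytes([check])


def packet_ok(packet: bytes) -> bool:
    """Receiver side: recompute the check code and compare it."""
    return (len(packet) >= 3 and packet[0] == SOH
            and sum(packet[2:-1]) % 256 == packet[-1])


def send_stop_and_wait(chunks, corrupt_first_try=frozenset()):
    """Send chunks one at a time, repeating any packet the receiver rejects."""
    received, retries = [], 0
    for seq, data in enumerate(chunks, start=1):
        attempt = 0
        while True:
            packet = bytearray(make_packet(seq, data))
            if attempt == 0 and seq in corrupt_first_try:
                packet[2] ^= 0x01  # simulated noise flips one bit in transit
            attempt += 1
            if packet_ok(bytes(packet)):
                received.append(bytes(packet[2:-1]))
                break       # receiver accepts; sender moves to the next packet
            retries += 1    # receiver rejects; sender sends the packet again
    return b"".join(received), retries


print(send_stop_and_wait([b"abc", b"def", b"ghi"], corrupt_first_try={2}))
# prints "(b'abcdefghi', 1)"
```

The file arrives intact even though one packet was hit, at the cost of one repeated packet.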

Like I said, the basic reason for the use of a file transfer protocol is to maintain the integrity of transmitted and received data. Since all communications may be subjected to noise or other data corrupting forces, use of a communication protocol will ensure that all information is transmitted and received error free.

Let's take a closer look at each of the common protocols.

XMODEM and XMODEM-1K :

This protocol is the "grand-daddy" of them all, since it was the first one ever to appear for general usage. It was conceived and developed by Ward Christensen in the late 70's, and gained rapid popularity because of its straightforward implementation, and widespread support on many BBSs.

XMODEM is receiver-initiated - both the receiver and sender computers are aware of the file transfer. XMODEM is generally used to download a file from a host to a user's PC that is operating as a terminal emulator. The file name is not preserved, so receiver and sender must both specify the name of the file being transferred. The file length is not preserved either: the file is padded up to the next multiple of 128 bytes (XMODEM) or 1024 bytes (XMODEM-1K).
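The padding arithmetic is easy to check for yourself. In this small Python sketch the function name padded_length is my own; the block sizes are the ones given above.

```python
def padded_length(file_length: int, block_size: int = 128) -> int:
    """Size of the received file after padding to whole blocks."""
    blocks = -(-file_length // block_size)  # ceiling division: partial block counts as one
    return blocks * block_size


print(padded_length(1000))        # XMODEM, 128 byte blocks: prints 1024
print(padded_length(1025, 1024))  # XMODEM-1K, 1024 byte blocks: prints 2048
```

So a 1000 byte file arrives 24 bytes longer than it left, which is harmless for many files but worth knowing about.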

The XMODEM session begins with an interchange of initialization characters in order to establish the format of the data transfer. Once the session has been initialized, the sender begins transmitting information starting with frame number 1. The session progresses with the sender transmitting a data frame and the receiver responding to the data frame. Upon successful transfer of the last data frame, the sender indicates to the receiver that the session has completed.
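Here is a Python sketch of how a sender might cut a file into classic checksum-mode XMODEM frames. It assumes the commonly described frame layout (SOH, block number, the block number's complement, 128 data bytes, a one byte arithmetic checksum) and the usual Ctrl-Z pad byte. The function name xmodem_frames is mine, and the receiver's side of the conversation is left out altogether.

```python
SOH = 0x01    # start of a 128 byte frame
PAD = 0x1A    # Ctrl-Z, the customary filler for the last, partly empty frame
BLOCK = 128


def xmodem_frames(payload: bytes) -> list:
    """Split payload into frames, numbering them from 1 (wrapping at 256)."""
    frames = []
    for index in range(0, len(payload), BLOCK):
        number = (index // BLOCK + 1) % 256
        data = payload[index:index + BLOCK].ljust(BLOCK, bytes([PAD]))
        checksum = sum(data) % 256  # simple arithmetic checksum of the data
        frames.append(bytes([SOH, number, 255 - number]) + data + bytes([checksum]))
    return frames


frames = xmodem_frames(b"A" * 130)
print(len(frames), len(frames[0]))  # prints "2 132"
print(frames[0][:3].hex())          # prints "0101fe"
```

The second frame carries only two real bytes; the other 126 are padding, which is exactly why the file length is not preserved.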

YMODEM BATCH :

YMODEM BATCH is a receiver initiated protocol. Essentially, it is an embellishment of the XMODEM-1K protocol which provides CRC error detection. It can use 1K data blocks, multiple files can be transferred within one session, and both the transferred file name and exact file length are sent to the receiver.

Like XMODEM, the YMODEM BATCH session begins with an interchange of initialization characters in order to establish the format of the data transfer. Once the session has been initialized, the sender begins transmitting information starting with packet number 0. Packet 0 contains the file name and (optionally) the file length information of the file data to be transferred. The session progresses with the sender transmitting a data packet and the receiver responding to the data packet. Upon successful transfer of the last data packet, the sender indicates to the receiver that the session has completed.
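A sketch of packet 0 in Python may make this clearer. It assumes the commonly described layout: a 128 byte data field holding the file name, a zero byte, then the length as decimal text, with the rest filled with zero bytes, followed by a 16 bit CRC (polynomial 1021H, starting value 0) sent high byte first. The function names crc16_xmodem and ymodem_packet0 are mine, and optional extras such as the file date are ignored.

```python
from typing import Optional


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def ymodem_packet0(filename: str, length: Optional[int] = None) -> bytes:
    """Build packet number 0: file name, and the file length if it is known."""
    field = filename.encode("ascii") + b"\x00"
    if length is not None:
        field += str(length).encode("ascii")
    if len(field) > 128:
        raise ValueError("file name too long for a 128 byte packet")
    data = field.ljust(128, b"\x00")
    crc = crc16_xmodem(data)
    # SOH, packet number 0, its complement, the data, then the CRC.
    return bytes([0x01, 0x00, 0xFF]) + data + crc.to_bytes(2, "big")


packet = ymodem_packet0("SALARY.DBF", 20480)
print(len(packet))   # prints 133
print(packet[3:20])  # prints b'SALARY.DBF\x0020480\x00'
```

Because the length travels with the file, the receiver can trim the padding off the last packet and end up with a file of exactly the right size.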

Since the filename of each file being transferred is preserved (from sender to receiver), the receiving YMODEM BATCH session need only specify the destination drive/directory for the path argument of the YMODEM BATCH receive function calls.

YMODEM-G BATCH :

This is probably the fastest possible file transfer protocol that exists today.

YMODEM-G BATCH is, essentially, an embellishment of the YMODEM BATCH protocol which provides CRC error detection, 1K data blocks, multiple files transferred within one session, and preservation of both the transferred file name and exact file length. In addition, this is a "flowing" protocol, especially meant for use with Error Correcting (MNP) modems.

YMODEM-G BATCH is almost identical to YMODEM BATCH, but in the case of YMODEM-G BATCH, the receiver does not respond to the data packets, unless an error takes place, in which case the transfer is aborted.
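Why does skipping the responses help so much ? A rough back-of-the-envelope model in Python shows the idea. Everything here is an assumption for illustration: the function names, the figure of 10 bits on the wire per byte, and the idea that each response costs one fixed turnaround delay. Real transfers have further overheads that this ignores.

```python
def packet_seconds(packet_bytes: int, bits_per_second: int, bits_per_byte: int = 10) -> float:
    """Time to push one packet down the line (10 bits per byte is assumed)."""
    return packet_bytes * bits_per_byte / bits_per_second


def stop_and_wait_seconds(packets: int, per_packet: float, turnaround: float) -> float:
    """Every packet is followed by a pause while the response comes back."""
    return packets * (per_packet + turnaround)


def streaming_seconds(packets: int, per_packet: float) -> float:
    """Packets follow each other back to back, with no pauses in between."""
    return packets * per_packet


one_k = packet_seconds(1024, 2400)
print(stop_and_wait_seconds(100, one_k, turnaround=0.5))
print(streaming_seconds(100, one_k))
```

The longer the turnaround delay (and it can be long on some links), the bigger the gap between the two figures.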

KERMIT :

The Kermit file transfer protocol was developed in 1981 by Frank da Cruz and Bill Catchings at Columbia University in order to facilitate the transfer of information between mainframes and PCs.

Kermit provides for multiple files to be transferred within one session and for preservation of both the file name and exact file length. It also allows for communication to take place on both 7 and 8 bit communication channels. (This is accomplished via the 8th bit prefixing option: any byte that has its 8th bit set is sent as a special prefix character followed by the same byte with the 8th bit cleared, which tells the receiver to set the 8th bit again on that byte.)
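The prefixing trick can be sketched in a few lines of Python. This is a simplified illustration, not a complete Kermit encoder: it assumes '&' as the 8th bit prefix and '#' as the control character prefix (real Kermit programs agree on these characters during initialization), it leaves out features such as repeat counts, and the names kermit_encode and kermit_decode are mine.

```python
QCTL = ord("#")  # assumed prefix for control characters
QBIN = ord("&")  # assumed prefix for bytes with the 8th bit set


def kermit_encode(data: bytes) -> bytes:
    """Turn arbitrary bytes into printable 7 bit characters."""
    out = bytearray()
    for byte in data:
        if byte & 0x80:
            out.append(QBIN)      # announce the 8th bit, then drop it
            byte &= 0x7F
        if byte < 0x20 or byte == 0x7F:
            out.append(QCTL)      # control character: send a printable stand-in
            out.append(byte ^ 0x40)
        elif byte in (QCTL, QBIN):
            out.append(QCTL)      # a literal prefix character must itself be prefixed
            out.append(byte)
        else:
            out.append(byte)
    return bytes(out)


def kermit_decode(encoded: bytes) -> bytes:
    """Undo kermit_encode."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        high = 0
        byte = encoded[i]
        if byte == QBIN:
            high = 0x80
            i += 1
            byte = encoded[i]
        if byte == QCTL:
            i += 1
            byte = encoded[i]
            if byte not in (QCTL, QBIN):
                byte ^= 0x40      # restore the original control character
        out.append(byte | high)
        i += 1
    return bytes(out)


print(kermit_encode(bytes([0x01, 0xC1, 0x41])))  # prints b'#A&AA'
```

The price is size: in the worst case one byte of the file becomes three characters on the line.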

Kermit transfers all information in the form of packets. After a packet is sent, Kermit will stop and wait for a response (in the form of a packet) before it sends the next packet. All Kermit packets are composed of printable characters.

Kermit packets can be a maximum of 96 bytes in length. To ensure data integrity they use a checksum error detection method in which an 8 bit sum is folded into 6 bits.
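The "folding" is easier to show than to describe. This Python sketch follows the single character checksum as it is usually described for Kermit: add up the characters, add the top two bits of the low byte back into the bottom, and keep 6 bits. The function names are mine, and which packet fields go into the sum is left to the caller.

```python
def kermit_checksum(fields: bytes) -> int:
    """Fold the arithmetic sum of the characters into a 6 bit value (0 to 63)."""
    s = sum(fields)
    return (s + ((s & 0xC0) >> 6)) & 0x3F


def to_printable(value: int) -> str:
    """Shift a 6 bit value into the printable range so it can travel as text."""
    return chr(value + 32)


check = kermit_checksum(b"A")
print(check, repr(to_printable(check)))  # prints: 2 '"'
```

Folding the top bits in, rather than simply throwing them away, means that damage to those bits still changes the checksum.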

Of the protocols described here, this is the only one that can be used to transfer binary data over 7 bit lines, but it does have the disadvantage of being less throughput efficient than any of the others.

ZMODEM :

Zmodem is sometimes referred to as the "King of FTPs", and with good reason. It can do things that almost no other protocol can do, such as recover from a crash, by continuing the transfer from the point where the transfer failed, rather than sending all the data once more.

Zmodem also has a unique "Auto Download" feature, that automatically starts the transfer at the receiver's end (provided this feature is enabled at the receiving end).

The protocol was developed by Chuck Forsberg, a developer at a company called Omen Technology, on request of the carrier network giant Telenet, for a very specific purpose : to overcome the pitfalls of packet switching networks.

Packet switching networks are data links similar to telephone lines, but specific to computer communication. They too use a form of data packetisation, which interferes with the functioning of normal file transfer protocols. (More on packet switching networks, including I-NET, in my next article).

Omen Technology developed the protocol, which was later placed in the public domain by Telenet.

Zmodem, like Ymodem-G Batch, uses the streaming style of data transfer, i.e. it doesn't wait for an acknowledgement of a packet before sending the next one. But unlike Ymodem-G Batch, it recognises error messages from the receiver, and takes action to rectify the problem.
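Both ideas - resuming after a crash and rewinding after an error - come down to the receiver telling the sender which position in the file it wants next. The Python sketch below is a toy model of that idea only; it does not build real Zmodem frames. The function name, the block size, the simulated noise and the "lag" (how many extra blocks the sender has already blasted out before the bad news reaches it) are all assumptions for illustration.

```python
def zmodem_like_transfer(source, already_received=b"", block=4,
                         noisy_offsets=frozenset(), lag_blocks=1):
    """Stream source to a receiver; return (received bytes, bytes sent)."""
    received = bytearray(already_received)
    position = len(received)  # crash recovery: start where the receiver left off
    bytes_sent = 0
    pending_noise = set(noisy_offsets)
    while position < len(source):
        if position in pending_noise:
            pending_noise.discard(position)  # noise hits this block only once
            # The sender has kept streaming while the error report travels back,
            # so the damaged block and the ones after it are wasted.
            wasted_end = min(len(source), position + (1 + lag_blocks) * block)
            bytes_sent += wasted_end - position
            position = len(received)  # rewind to the position the receiver asks for
            continue
        chunk = source[position:position + block]
        received += chunk
        bytes_sent += len(chunk)
        position += len(chunk)
    return bytes(received), bytes_sent


data = b"0123456789ABCDEF"
print(zmodem_like_transfer(data, already_received=data[:8]))  # prints (b'0123456789ABCDEF', 8)
print(zmodem_like_transfer(data, noisy_offsets={4}))          # prints (b'0123456789ABCDEF', 24)
```

In the first call only the missing half of the file is sent; in the second, one hit costs 8 wasted bytes instead of the whole transfer.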

Winding Down

Woops, out of space again ! (How words fly)

OK, my apologies about the "techno-junkie" feel of this article. Next month, I look at various ways to broaden your horizons, including BBSs and Carrier Networks - in particular I-NET, India's premier network.

Till then,

Ciao !