A computer virus is a type of malware that, when executed, replicates
by inserting copies of itself (possibly modified) into other computer
programs, data files, or the boot sector of the hard drive; when this
replication succeeds, the affected areas are then said to be "infected".
Viruses often perform some type of harmful activity on infected hosts, such
as stealing hard disk space or CPU time, accessing private information,
corrupting data, displaying political or humorous messages on the user's
screen, spamming their contacts, or logging their keystrokes. However, not
all viruses carry a destructive payload or attempt to hide themselves—the
defining characteristic of viruses is that they are self-replicating
computer programs which install themselves without the user's consent.
Virus writers use social engineering and exploit detailed knowledge of
security vulnerabilities to gain access to their hosts' computing resources.
The vast majority of viruses (over 99%) target systems running Microsoft
Windows, employing a variety of mechanisms to infect new hosts, and often
using complex anti-detection/stealth strategies to evade antivirus software.
Motives for creating viruses can include seeking profit, sending a
political message, personal amusement, demonstrating that a vulnerability
exists in software, sabotage and denial of service, or simply a wish to
explore artificial life and evolutionary algorithms.
Computer viruses currently cause billions of dollars worth of economic
damage each year by causing system failures, wasting computer
resources, corrupting data, increasing maintenance costs, etc. In response,
free, open-source anti-virus tools have been developed, and a multi-billion
dollar industry of anti-virus software vendors has cropped up, selling virus
protection to Windows users. Unfortunately, no currently existing anti-virus
software is able to catch all computer viruses (especially new ones);
computer security researchers are actively searching for new ways to enable
antivirus solutions to detect emerging viruses more effectively, before they
become widely distributed.
Vulnerabilities and infection vectors
Software bugs
Because software is often designed with security features to prevent
unauthorized use of system resources, many viruses must exploit security
bugs (security defects) in system or application software to spread.
Software development strategies that produce large numbers of bugs will
generally also produce potential exploits.
Social engineering and poor security practices
In order to replicate itself, a virus must be permitted to execute code and
write to memory. For this reason, many viruses attach themselves to
executable files that may be part of legitimate programs (see code
injection). If a user attempts to launch an infected program, the virus'
code may be executed simultaneously.
In operating systems that use file extensions to determine program
associations (such as Microsoft Windows), the extensions may be hidden from
the user by default. This makes it possible to create a file that is of a
different type than it appears to the user. For example, an executable may
be named "picture.png.exe"; the user sees only "picture.png", assumes that
the file is an image and most likely safe, yet opening it runs the
executable on the client machine.
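The double-extension trick described above can be checked for mechanically. The following is a minimal sketch in Python, not a description of how any particular product works; the two extension lists and the function name are assumptions chosen for illustration and are far from exhaustive.

```python
from pathlib import PurePath

# Assumed, non-exhaustive extension lists, for illustration only.
EXECUTABLE_EXTENSIONS = {".exe", ".com", ".scr", ".bat", ".cmd", ".pif"}
DECOY_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".pdf", ".txt", ".doc", ".docx"}


def looks_like_disguised_executable(filename: str) -> bool:
    """Return True if the name ends in an executable extension that is
    directly preceded by a harmless-looking one, e.g. "picture.png.exe"."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    if len(suffixes) < 2:
        return False
    # The last suffix decides what the OS runs; the one before it is
    # what the user sees when known extensions are hidden.
    return suffixes[-1] in EXECUTABLE_EXTENSIONS and suffixes[-2] in DECOY_EXTENSIONS


if __name__ == "__main__":
    for name in ("picture.png.exe", "picture.png", "setup.exe", "archive.tar.gz"):
        print(name, looks_like_disguised_executable(name))
```

A check like this only flags the naming pattern; it says nothing about whether the file's contents are actually malicious.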
Vulnerability of different operating systems to viruses
The vast majority of viruses target systems running Microsoft Windows. This
is due both to Microsoft's large market share of desktop users (over 95%),
and to design choices in Windows that make it much easier for viruses to
infect hosts running Windows. Also, the diversity of software systems on a
network limits the destructive potential of viruses and malware. Open-source
operating systems such as Linux allow users to choose from a variety of
desktop environments, packaging tools, etc. which means that malicious code
targeting any one of these systems will only affect a subset of all users.
However, most Windows users are running the same set of applications, so
viruses are able to rapidly spread amongst Windows systems by targeting the
same exploits on large numbers of hosts.
Theoretically, other operating systems are also susceptible to viruses, but
in practice these are extremely rare or non-existent, due to much more
robust security architectures in Unix-like systems (including Linux and Mac
OS X) and to the diversity of the applications running on them. There are no
known viruses that have spread "in the wild" for Mac OS X. The difference in
virus vulnerability between Macs and Windows is a chief selling point, one
that Apple uses in their Get a Mac advertising.
While Linux (and Unix in general) has always natively prevented normal users
from making changes to the operating system environment without permission,
Windows users are generally not prevented from making these changes, meaning
that viruses can easily gain control of the entire system on Windows hosts.
This difference has continued partly due to the widespread use of
administrator accounts in contemporary versions like Windows XP. In 1997,
researchers created and released a virus for Linux, known as "Bliss". Bliss,
however, requires that the user run it explicitly, and it can only infect
programs that the user has access to modify. Unlike Windows users, most
Unix users do not log in as an administrator user except to install or
configure software; as a result, even if a user ran the virus, it could not
harm their operating system. The Bliss virus never became widespread, and
remains chiefly a research curiosity. Its creator later posted the source
code to Usenet, allowing researchers to see how it worked.
Infection targets and replication techniques
Computer viruses infect a variety of different subsystems on their hosts.
One manner of classifying viruses is to analyze whether they reside in
binary executables (such as .EXE or .COM files), data files (such as
Microsoft Word documents or PDF files), or in the boot sector of the host's
hard drive (or some combination of all of these).
Resident vs. non-resident viruses
A memory-resident virus (or simply "resident virus") installs itself
as part of the operating system when executed, after which it remains in RAM
from the time the computer is booted up to when it is shut down. Resident
viruses overwrite interrupt handling code or other functions, and when the
operating system attempts to access the target file or disk sector, the
virus code intercepts the request and redirects the control flow to the
replication module, infecting the target. In contrast, a
non-memory-resident virus (or "non-resident virus"), when executed,
scans the disk for targets, infects them, and then exits (i.e. it does not
remain in memory after it is done executing).
Macro viruses
Many common applications, such as Microsoft Outlook and Microsoft Word,
allow macro programs to be embedded in documents or emails, so that the
programs may be run automatically when the document is opened. A macro
virus (or "document virus") is a virus that is written in a macro
language, and embedded into these documents so that when users open the
file, the virus code is executed, and can infect the user's computer. This
is one of the reasons that it is dangerous to open unexpected attachments in
e-mails.
Boot sector viruses
Boot sector viruses specifically target the boot sector/Master Boot Record
(MBR) of the host's
hard drive or removable storage media (flash drives, floppy disks, etc.).
Stealth strategies
In order to avoid detection by users, some viruses employ different kinds of
deception. Some old viruses, especially on the MS-DOS platform, make sure
that the "last modified" date of a host file stays the same when the file is
infected by the virus. This approach does not fool antivirus software,
however, especially programs that maintain and date cyclic redundancy checks
on file changes.
Some viruses can infect files without increasing their sizes or damaging the
files. They accomplish this by overwriting unused areas of executable files.
These are called cavity viruses. For example, the CIH virus, or
Chernobyl Virus, infects Portable Executable files. Because those files have
many empty gaps, the virus, which was 1 KB in length, did not add to the
size of the file.
Some viruses try to avoid detection by killing the tasks associated with
antivirus software before it can detect them.
As computers and operating systems grow larger and more complex, old hiding
techniques need to be updated or replaced. Defending a computer against
viruses may demand that a file system migrate towards detailed and explicit
permission for every kind of file access.
Read request intercepts
While some antivirus software employs various techniques to counter stealth
mechanisms, once the infection occurs any recourse to clean the system is
unreliable. In Microsoft Windows operating systems, the NTFS file system is
proprietary. Direct access to files without using the Windows OS is
undocumented. This leaves antivirus software little alternative but to send
a read request to the Windows OS files that handle such requests. Some
viruses trick antivirus software by intercepting its requests to the OS. A
virus can hide itself by intercepting the request to read the infected file,
handling the request itself, and returning an uninfected version of the file
to the antivirus software. The interception can occur by code injection into
the actual operating system files that would handle the read request. Thus,
antivirus software attempting to detect the virus will either not be given
permission to read the infected file, or the read request will be served
with the uninfected version of the same file.
The only reliable method to avoid stealth is to boot from a medium that is
known to be clean. Security software can then be used to check the dormant
operating system files. Most security software relies on virus signatures
or employs heuristics.
Security software may also use a database of file hashes for Windows OS
files, so the security software can identify altered files, and request
Windows installation media to replace them with authentic versions. In older
versions of Windows, file hashes of Windows OS files stored in Windows—to
allow file integrity/authenticity to be checked—could be overwritten so that
the System File Checker would report that altered system files are
authentic, so using file hashes to scan for altered files would not always
guarantee finding an infection.
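The hash-database idea can be illustrated with a short Python sketch. It is a simplified illustration rather than the mechanism used by the System File Checker or any specific product: the function names are hypothetical, SHA-256 is an assumed choice of hash, and the baseline is held in memory, whereas in practice it would need to be stored where malware cannot overwrite it (the weakness noted above) and checked after booting from a clean medium.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths):
    """Record a known-good hash for each file (hypothetical helper)."""
    return {str(p): sha256_of(Path(p)) for p in paths}


def find_altered(baseline):
    """Return the files that are missing or whose hash no longer matches."""
    altered = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            altered.append(name)
    return sorted(altered)
```

A file reported by find_altered has changed since the baseline was taken; whether the change is an infection or a legitimate update still has to be decided separately.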
Self-modification
Most modern antivirus programs try to find virus-patterns inside ordinary
programs by scanning them for so-called virus signatures.
Unfortunately, the term is misleading, in that viruses do not possess unique
signatures in the way that human beings do. Such a virus signature is merely
a sequence of bytes that an antivirus program looks for because it is known
to be part of the virus. A better term would be "search strings". Different
antivirus programs will employ different search strings, and indeed
different search methods, when identifying viruses. If a virus scanner finds
such a pattern in a file, it will perform other checks to make sure that it
has found the virus, and not merely a coincidental sequence in an innocent
file, before it notifies the user that the file is infected. The user can
then delete, or (in some cases) "clean" or "heal" the infected file. Some
viruses employ techniques that make detection by means of signatures
difficult but probably not impossible. These viruses modify their code on
each infection. That is, each infected file contains a different variant of
the virus.
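The basic search-string approach can be sketched in a few lines of Python. The signature names and byte sequences below are invented placeholders, not real virus signatures, and a real scanner would follow a match with the additional confirmation checks described above.

```python
# Hypothetical signature database: name -> byte sequence to search for.
# These values are placeholders and do not correspond to real malware.
SIGNATURES = {
    "Example.A": bytes.fromhex("deadbeef0badf00d"),
    "Example.B": b"HYPOTHETICAL-MARKER",
}


def scan_bytes(data: bytes, signatures=SIGNATURES):
    """Return a sorted list of (signature name, offset) for each
    search string found in the data."""
    hits = []
    for name, pattern in signatures.items():
        offset = data.find(pattern)
        if offset != -1:
            hits.append((name, offset))
    return sorted(hits)


def scan_file(path: str, signatures=SIGNATURES):
    """Read a whole file and scan it; adequate for a small illustration."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read(), signatures)
```

Because the match is an exact byte comparison, a virus that changes even one byte of the searched region on each infection is missed by this sketch, which is the weakness that self-modifying viruses exploit.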
Encrypted viruses
One method of evading signature detection is to use simple encryption to
encipher the body of the virus, leaving only the encryption module and a
cryptographic key in cleartext. In this case, the virus consists of a small
decrypting module and an encrypted copy of the virus code. If the virus is
encrypted with a different key for each infected file, the only part of the
virus that remains constant is the decrypting module, which would (for
example) be appended to the end. In this case, a virus scanner cannot
directly detect the virus using signatures, but it can still detect the
decrypting module, which still makes indirect detection of the virus
possible. Since these would be symmetric keys, stored on the infected host,
it is in fact entirely possible to decrypt the final virus, but this is
probably not required, since self-modifying code is such a rarity that it
may be reason for virus scanners to at least flag the file as suspicious.
An old but compact encryption technique involves XORing each byte in a virus
with a constant, so that the exclusive-or operation has only to be repeated
for decryption. It is suspicious for code to modify itself, so the code that
performs the encryption/decryption may be part of the signature in many
virus definitions.
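Because XOR with a constant is its own inverse and a single byte has only 256 possible values, a scanner can in principle test every key. The Python sketch below illustrates that idea from the defender's side; the function names are hypothetical, and it assumes the simplest case of one constant key byte applied to every byte.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with the same constant. Applying the function
    twice with the same key returns the original data."""
    return bytes(b ^ key for b in data)


def find_xor_encoded(data: bytes, pattern: bytes):
    """Search for a known byte pattern under every single-byte XOR key.

    Rather than decoding the data 256 times, the (short) pattern is
    encoded with each candidate key and searched for directly.
    Returns a list of (key, offset) pairs; key 0 means unencoded.
    """
    hits = []
    for key in range(256):
        offset = data.find(xor_bytes(pattern, key))
        if offset != -1:
            hits.append((key, offset))
    return hits
```

This only works when the key space is tiny; it does not extend to longer or per-position keys, which is one reason the text notes that scanners often target the decrypting module instead.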
Polymorphic code
Polymorphic code was the first technique that posed a serious threat to
virus scanners. Just like regular encrypted viruses, a polymorphic virus
infects files with an encrypted copy of itself, which is decoded by a
decryption module. In the case of polymorphic viruses, however, this
decryption module is also modified on each infection. A well-written
polymorphic virus therefore has no parts which remain identical between
infections, making it very difficult to detect directly using signatures.
Antivirus software can detect it by decrypting the viruses using an
emulator, or by statistical pattern analysis of the encrypted virus body. To
enable polymorphic code, the virus has to have a polymorphic engine (also
called mutating engine or mutation engine) somewhere in its encrypted body.
See polymorphic code for technical detail on how such engines operate.
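The text does not say which statistics are used for pattern analysis. One simple measure that is often used to judge whether a block of bytes looks encrypted or compressed is Shannon entropy, and the following Python sketch computes it. Treating entropy as the statistic is an assumption made here for illustration, not a claim about any particular scanner.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of the byte distribution, in bits per
    byte: 0.0 for constant data, up to 8.0 for uniformly random bytes."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

The measure has clear limits: legitimately compressed data also scores high, and XORing every byte with one constant merely relabels byte values, leaving the entropy unchanged, so a high or low score is at most one input to a broader decision.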
Some viruses employ polymorphic code in a way that constrains the mutation
rate of the virus significantly. For example, a virus can be programmed to
mutate only slightly over time, or it can be programmed to refrain from
mutating when it infects a file on a computer that already contains copies
of the virus. The advantage of using such slow polymorphic code is that it
makes it more difficult for antivirus professionals to obtain representative
samples of the virus, because bait files that are infected in one run will
typically contain identical or similar samples of the virus. This will make
it more likely that the detection by the virus scanner will be unreliable,
and that some instances of the virus may be able to avoid detection.
Metamorphic code
To avoid being detected by emulation, some viruses rewrite themselves
completely each time they are to infect new executables. Viruses that
utilize this technique are said to be metamorphic. To enable metamorphism, a
metamorphic engine is needed. A metamorphic virus is usually very large and
complex. For example, W32/Simile consisted of over 14,000 lines of assembly
language code, 90% of which was part of the metamorphic engine.
Countermeasures
Many users install antivirus software that can detect and eliminate known
viruses when the computer attempts to download or run the executable (which
may be distributed as an email attachment, or on USB flash drives, for
example). Some antivirus software blocks known malicious web sites that
attempt to install malware. Antivirus software does not change the
underlying capability of hosts to transmit viruses. Users must update their
software regularly to patch security vulnerabilities ("holes"). Antivirus
software also needs to be regularly updated in order to recognize the latest
threats. The German AV-TEST Institute publishes evaluations of antivirus
software for Windows and Android.
Examples of Microsoft Windows antivirus and anti-malware software include
the optional Microsoft Security Essentials (for Windows XP, Vista and
Windows 7) for real-time protection, the Windows Malicious Software Removal
Tool (now included with Windows (Security) Updates on "Patch Tuesday", the
second Tuesday of each month), and Windows Defender (an optional download in
the case of Windows XP). Additionally, several capable antivirus software
programs are available for free download from the Internet (usually
restricted to non-commercial use). Some such free programs are almost as
good as commercial competitors. Common security vulnerabilities are assigned
CVE IDs and listed in the US National Vulnerability Database. Secunia PSI is
an example of software, free for personal use, that will check a PC for
vulnerable out-of-date software, and attempt to update it. Ransomware and
phishing scam alerts appear as press releases on the Internet Crime
Complaint Center noticeboard.
Other commonly used preventative measures include timely operating system
updates, software updates, careful Internet browsing, and installation of
only trusted software.
There are two common methods that an antivirus software application uses to
detect viruses, as described in the antivirus software article. The first,
and by far the most common, method of virus detection uses a list of
virus signature definitions. This works by examining the content of the
computer's memory (its RAM, and boot sectors) and the files stored on fixed
or removable drives (hard drives, floppy drives, or USB flash drives), and
comparing those files against a database of known virus "signatures". Virus
signatures are just strings of code that are used to identify individual
viruses; for each virus, the anti-virus designer tries to choose a unique
signature string that will not be found in a legitimate program. Different
anti-virus programs use different "signatures" to identify viruses. The
disadvantage of this detection method is that users are only protected from
viruses that are detected by signatures in their most recent virus
definition update, and not protected from new viruses (see "zero-day
attack").
A second method to find viruses is to use a heuristic algorithm based on
common virus behaviors. This method has the ability to detect new viruses
for which anti-virus security firms have yet to define a "signature", but it
also gives rise to more false positives than using signatures. False
positives can be disruptive, especially in a commercial environment.
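The trade-off between catching new viruses and raising false positives can be illustrated with a toy scoring heuristic in Python. The behavior names, weights, and threshold below are all hypothetical values chosen for the example; real heuristic engines are considerably more elaborate.

```python
# Hypothetical behaviors and weights, for illustration only.
WEIGHTS = {
    "modifies_other_executables": 5,
    "writes_to_boot_sector": 5,
    "terminates_security_processes": 4,
    "self_modifying_code": 3,
    "double_file_extension": 2,
}


def heuristic_score(observed) -> int:
    """Sum the weights of the observed behaviors; unknown names score 0."""
    return sum(WEIGHTS.get(name, 0) for name in observed)


def classify(observed, threshold: int = 6) -> str:
    """Flag a program as suspicious once its score reaches the threshold."""
    return "suspicious" if heuristic_score(observed) >= threshold else "clean"
```

With the default threshold, a program showing only self-modifying code is classified as clean, while lowering the threshold to 3 flags it as suspicious. A lower threshold catches more unknown viruses but also flags more legitimate programs that happen to share a behavior, which is the false-positive cost described above.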
Recovery strategies and methods
One can also reduce the damage done by viruses by making regular backups of
data (and the operating systems) on different media, that are either kept
unconnected to the system (most of the time), read-only or not accessible
for other reasons, such as using different file systems. This way, if data
is lost through a virus, one can start again using the backup (which will
hopefully be recent).
If a backup session on optical media like CD and DVD is closed, it becomes
read-only and can no longer be affected by a virus (so long as a virus or
infected file was not copied onto the CD/DVD). Likewise, an operating system
on a bootable CD can be used to start the computer if the installed
operating systems become unusable. Backups on removable media must be
carefully inspected before restoration. The Gammima virus, for example,
propagates via removable flash drives.
Virus removal
Many websites run by antivirus software companies provide free online virus
scanning, with limited cleaning facilities (the purpose of the sites is to
sell anti-virus products). Some websites—like Google subsidiary
VirusTotal.com—allow users to upload one or more suspicious files to be
scanned and checked by one or more antivirus programs in one operation.
Additionally, several capable antivirus software programs are available for
free download from the Internet (usually restricted to non-commercial use).
Microsoft offers an optional free antivirus utility called Microsoft
Security Essentials, a Windows Malicious Software Removal Tool that is
updated as part of the regular Windows update regime, and an older optional
anti-malware (malware removal) tool Windows Defender that has been upgraded
to an antivirus product in Windows 8.
Some viruses disable System Restore and other important Windows tools such
as Task Manager and Command Prompt. An example of a virus that does this is
CiaDoor. Many such viruses can be removed by rebooting the computer,
entering Windows safe mode with networking, and then using system tools or
Microsoft Safety Scanner. System Restore on Windows Me, Windows XP, Windows
Vista and Windows 7 can restore the registry and critical system files to a
previous checkpoint. Often a virus will cause a system to hang, and a
subsequent hard reboot will render a system restore point from the same day
corrupt. Restore points from previous days should work provided the virus is
not designed to corrupt the restore files and does not exist in previous
restore points.
Operating system reinstallation
Microsoft's System File Checker (improved in Windows 7 and later) can be
used to check for, and repair, corrupted system files.
Restoring an earlier "clean" (virus-free) copy of the entire partition from
a cloned disk, a disk image, or a backup copy is one solution. Restoring an
earlier backup disk image is relatively simple to do, usually removes any
malware, and may be faster than disinfecting the computer or reinstalling
and reconfiguring the operating system and programs from scratch, as
described below, and then restoring user preferences.
Reinstalling the operating system is another approach to virus removal. It
may be possible to recover copies of essential user data by booting from a
live CD, or connecting the hard drive to another computer and booting from
the second computer's operating system, taking great care not to infect that
computer by executing any infected programs on the original drive. The
original hard drive can then be reformatted and the OS and all programs
installed from original media. Once the system has been restored,
precautions must be taken to avoid reinfection from any restored executable
files.
Historical development
Early academic work on self-replicating programs
The first academic work on the theory of self-replicating computer programs
was done in 1949 by John von Neumann who gave lectures at the University of
Illinois about the "Theory and Organization of Complicated Automata". The
work of von Neumann was later published as the "Theory of self-reproducing
automata". In his essay von Neumann described how a computer program could
be designed to reproduce itself. Von Neumann's design for a self-reproducing
computer program is considered the world's first computer virus, and he is
considered to be the theoretical father of computer virology.
In 1972 Veith Risak, directly building on von Neumann's work on
self-replication, published his article "Selbstreproduzierende Automaten mit
minimaler Informationsübertragung" (Self-reproducing automata with minimal
information exchange). The article describes a fully functional virus
written in assembler language for a SIEMENS 4004/35 computer system.
In 1980 Jürgen Kraus wrote his diplom thesis "Selbstreproduktion bei
Programmen" (Self-reproduction of programs) at the University of Dortmund.
In his work Kraus postulated that computer programs can behave in a way
similar to biological viruses.
The first computer viruses
The Creeper virus was first detected on ARPANET, the forerunner of the
Internet, in the early 1970s. Creeper was an experimental self-replicating
program written by Bob Thomas at BBN Technologies in 1971. Creeper used the
ARPANET to infect DEC PDP-10 computers running the TENEX operating system.
Creeper gained access via the ARPANET and copied itself to the remote system
where the message, "I'm the creeper, catch me if you can!" was displayed.
The Reaper program was created to delete Creeper.
In 1982, a program called "Elk Cloner" was the first personal computer virus
to appear "in the wild"—that is, outside the single computer or lab where it
was created. Written in 1981 by Richard Skrenta, it attached itself to the
Apple DOS 3.3 operating system and spread via floppy disk. This virus,
created as a practical joke when Skrenta was still in high school, was
injected in a game on a floppy disk. On its 50th use the Elk Cloner virus
would be activated, infecting the personal computer and displaying a short
poem beginning "Elk Cloner: The program with a personality."
In 1984 Fred Cohen from the University of Southern California wrote his
paper "Computer Viruses – Theory and Experiments". It was the first paper to
explicitly call a self-reproducing program a "virus", a term introduced by
Cohen's mentor Leonard Adleman. In 1987, Fred Cohen published a
demonstration that there is no algorithm that can perfectly detect all
possible viruses. Fred Cohen's theoretical compression virus was an example
of a virus which was not malware, but was putatively benevolent. However,
antivirus professionals do not accept the concept of benevolent viruses, as
any desired function can be implemented without involving a virus (automatic
compression, for instance, is available under the Windows operating system
at the choice of the user). Any virus will by definition make unauthorised
changes to a computer, which is undesirable even if no damage is done or
intended. On page one of Dr Solomon's Virus Encyclopaedia, the
undesirability of viruses, even those that do nothing but reproduce, is
thoroughly explained.
An article that describes "useful virus functionalities" was published by J.
B. Gunn under the title "Use of virus functions to provide a virtual APL
interpreter under user control" in 1984.
The first IBM PC virus in the wild was a boot sector virus dubbed (c)Brain,
created in 1986 by the Farooq Alvi Brothers in Lahore, Pakistan, reportedly
to deter piracy of the software they had written.
The first virus to specifically target Microsoft Windows, WinVir, was
discovered in April 1992, two years after the release of Windows 3.0. The
virus did not contain any Windows API calls, instead relying on DOS
interrupts. A few years later, in February 1996, Australian hackers from the
virus-writing crew VLAD created the Boza virus, which was the first known
virus to target Windows 95. In late 1997 the encrypted, memory-resident
stealth virus Win32.Cabanas was released—the first known virus that targeted
Windows NT (it was also able to infect Windows 3.0 and Windows 9x hosts).
Even home computers were affected by viruses. The first one to appear on the
Commodore Amiga was a boot sector virus called SCA virus, which was detected
in November 1987.
Viruses and the Internet
Before computer networks became widespread, most viruses spread on removable
media, particularly floppy disks. In the early days of the personal
computer, many users regularly exchanged information and programs on
floppies. Some viruses spread by infecting programs stored on these disks,
while others installed themselves into the disk boot sector, ensuring that
they would be run when the user booted the computer from the disk, usually
inadvertently. Personal computers of the era would attempt to boot first
from a floppy if one had been left in the drive. Until floppy disks fell out
of use, this was the most successful infection strategy and boot sector
viruses were the most common in the wild for many years.
Traditional computer viruses emerged in the 1980s, driven by the spread of
personal computers and the resultant increase in BBS, modem use, and
software sharing. Bulletin board–driven software sharing contributed
directly to the spread of Trojan horse programs, and viruses were written to
infect popularly traded software. Shareware and bootleg software were
equally common vectors for viruses on BBSs. Viruses can increase their
chances of spreading to other computers by infecting files on a network file
system or a file system that is accessed by other computers.
Macro viruses have become common since the mid-1990s. Most of these viruses
are written in the scripting languages for Microsoft programs such as Word
and Excel and spread throughout Microsoft Office by infecting documents and
spreadsheets. Since Word and Excel were also available for Mac OS, most
could also spread to Macintosh computers. Although most of these viruses did
not have the ability to send infected email messages, those that did took
advantage of the Microsoft Outlook COM interface.
Some old versions of Microsoft Word allow macros to replicate themselves
with additional blank lines. If two macro viruses simultaneously infect a
document, the combination of the two, if also self-replicating, can appear
as a "mating" of the two and would likely be detected as a virus unique from
the "parents".
A virus may also send a web address link as an instant message to all the
contacts on an infected machine. If the recipient, thinking the link is from
a friend (a trusted source) follows the link to the website, the virus
hosted at the site may be able to infect this new computer and continue
propagating. Viruses that spread using cross-site scripting were first
reported in 2002, and were academically demonstrated in 2005. There have
been multiple instances of cross-site scripting viruses in the wild,
exploiting websites such as MySpace and Yahoo!