Linux and Networking Bibliographies
- Linux article references
There are approximately 200 complete or partial items in this
listing. About 80% of the links lead to the complete work, and about 20% lead to
identifying information that lets you locate the reference at your library or hunt it
down with a search engine. Unfortunately, some of the more commercial web sites use
partial chapters as bait to lure you into buying the hard copy of the text, rather
than putting the whole thing online and letting the consumer decide whether they prefer
the free online version or wish to purchase a hard copy.
Another problem with maintaining this listing is that web sites live and die faster than
fruit flies, so it is possible that links will be up one day and down the next.
SEARCHING THIS LIST - use your browser's built-in
find feature
To search this list for topics of particular interest, use your web browser's built-in
find feature to search the page for keywords. In Internet Explorer, the FIND feature
is located under the EDIT menu at the top of the browser.
In Netscape, the FIND option is also under the EDIT menu at
the top of the browser.
Good luck... good hunting... and hopefully good reading !
Network Bibliography : Linux Networking
W. Almesberger, J. Salim and A. Kuznetsov, "Differentiated
Services on Linux," Internet Engineering Task Force, Jun. 1999.
Abstract: Recent Linux kernels offer a wide variety of traffic control
functions, which can be combined in a modular way. We have designed support for
Differentiated Services based on the existing traffic control elements, and we have
implemented new components where necessary. In this document we give a brief overview of
the structure of Linux traffic control, and we describe our prototype implementation in
more detail.
J. Brustoloni, "VPN Masquerade
Assist (VMA): An End-to-End Mechanism for Robust NAT Interoperation with IPSec's IKE and
ESP Tunnel Mode," Internet Engineering Task Force, Jul. 2000.
Abstract: Linux's NAT (Network Address Translator) implementation is called IP
Masquerade. Its
J. Salim and U. Ahmed, "Performance
Evaluation of Explicit Congestion Notification (ECN) in IP Networks," Internet
Engineering Task Force, Apr. 2000.
Abstract: This draft presents a performance study of the Explicit Congestion
Notification (ECN) mechanism in the TCP/IP protocol using our implementation on the Linux
Operating System. ECN is an end-to-end congestion avoidance mechanism proposed by [6] and
incorporated into RFC 2481[7]. We study the behavior of ECN for both bulk and
transactional transfers. Our experiments show that there is improvement in throughput over
NON ECN (TCP employing any of Reno, SACK/FACK or NewReno congestion control) in the case
of bulk transfers and substantial improvement for transactional transfers.
R. Jain, T. Raleigh, M. Bereschinsky and C. Graff, "Mobile IP with Location
Registers (MIP-LR)," Internet Engineering Task Force, Feb. 2000.
Abstract: This document is intended only to provide information to the Internet
community. The Mobile IP (MIP) protocol for IP version 4 provides continuous Internet
connectivity to mobile hosts, without requiring any changes to existing routers and
higher-layer applications. This document describes an alternative protocol, Mobile IP with
Location Registers (MIP-LR), where the sender first queries a database, called a Location
Register, to obtain the recipient's current location. MIP-LR is designed for operation in
tactical military environments, enterprise environments or within logical administrative
domains, as it requires a sending host to be aware which hosts implement the MIP-LR
protocol. MIP-LR gives up the transparency of MIP for several benefits in the areas of
survivability, performance and interoperability. MIP-LR improves survivability for
situations where the mobile's home network is particularly vulnerable (e.g. in the forward
area of a battlefield), by allowing location registers to be placed and replicated outside
the home network. This document describes how replicated location registers can be managed
by Translation Server or Quorum schemes. In terms of performance, MIP-LR avoids triangle
routing and tunneling, and reduces the load on the home network as well as the home
agents. MIP-LR provides improved interoperability with protocols such as RSVP for
providing QoS guarantees. Finally, MIP-LR is interoperable with MIP, such that hosts which
implement only MIP can continue to operate as expected (provided the provisions required
by MIP, such as Home and Foreign Agents, are appropriately provided) without gaining any
of the benefits of MIP-LR. We have developed a working version of the MIP-LR protocol
running on Linux hosts.
Neil Spring, Maureen Chesire, Mark Berryman, Vivek Sahasranaman, Thomas Anderson and
Brian Bershad, "Receiver
Based Management of Low Bandwidth Access Links," in Proceedings of the
Conference on Computer Communications (IEEE Infocom), (Tel Aviv, Israel), Mar. 2000.
Abstract: In this paper, we describe a receiver based congestion control policy
that leverages TCP flow control mechanisms to prioritize mixed traffic loads across access
links. We manage queuing at the access link to: (1) improve the response time of
interactive network applications; (2) reduce congestion-related packet losses; while (3)
maintaining high throughput for bulk-transfer applications. Our policy controls queue
length by manipulating receive socket buffer sizes. We have implemented this solution in a
dynamically loadable Linux kernel module, and tested it over low bandwidth links. Our
approach yields a 7-fold improvement in packet latency over an unmodified system while
maintaining link utilization at 94%. In the common case, congestion-related packet losses
at the access link can be eliminated. Finally, by prioritizing short flows, we show that
our system reduces the time to download a complex web page during a large background
transfer by a factor of two.
Keywords: Congestion, admission and flow control; Traffic management and
control; Quality of service (general)
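The mechanism named in the abstract above, bounding access-link queueing by limiting receive socket buffers, can be illustrated roughly as follows. This is a minimal user-space sketch, not the authors' kernel module: the function names, the equal-share policy, and the 2048-byte floor are assumptions made for illustration only.

```python
import socket


def per_flow_window(link_bps: int, rtt_s: float, n_flows: int,
                    min_bytes: int = 2048) -> int:
    """Split one bandwidth-delay product equally among n_flows.

    The equal split and the min_bytes floor are illustrative assumptions,
    not the policy described in the paper.
    """
    if n_flows < 1:
        raise ValueError("n_flows must be at least 1")
    # Bytes that must be in flight to keep the access link busy.
    bdp_bytes = link_bps * rtt_s / 8
    return max(min_bytes, int(bdp_bytes / n_flows))


def cap_receive_buffer(sock: socket.socket, nbytes: int) -> int:
    """Request a receive-buffer limit and return the size the OS reports.

    A smaller receive buffer means a smaller advertised TCP window, which
    in turn limits how much data the sender can queue at the access link.
    The OS may round or scale the requested value.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, nbytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

For example, `per_flow_window(1_000_000, 0.25, 2)` gives each of two flows half of the 31,250-byte bandwidth-delay product of a 1 Mbps link with a 250 ms round-trip time.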
Werner Almesberger, "High-speed ATM networking on low-end computer systems,"
Laboratoire de Réseaux de Communication, Swiss Federal Institute of Technology Lausanne,
no. DI 95/147, 1995.
Keywords: ATM, Linux, woa
Michael Beck, Harald Bohme, Mirko Dziadzka, Ulrich Kunitz, Robert Magnus and Dirk
Verworner, "Linux Kernel Internals," Harlow, England, 1996.
Keywords: Unix; Linux; operating system internals
Werner Almesberger, "ATM on Linux,"
EPFL DI-LRC, EPFL, 1015 Lausanne, Switzerland, no. 96/181, Mar. 1996.
Andrew Tridgell, Paul Mackerras, David Sitsky and David Walsh, "AP/Linux - Initial
Implementation," Dept. of Computer Science, Australian National University, Canberra
0200 ACT, Australia, no. TR-CS-96-07, Jun. 1996.
Abstract: The AP1000+ is a distributed-memory parallel computer based on
SuperSPARC processors, which incorporates message-passing hardware which can be accessed
safely from user mode. We are in the process of porting the Linux kernel to this machine
and extending it to support execution of parallel programs. This report outlines the
motivation and background of this effort, and describes the current status and future
directions for the work. The reader may also refer to our WWW page at
http://cap.anu.edu.au/cap/projects/linux for up to date information on the progress of the
port.
Brendan D. McKay, "autoson - a distributed batch system for UNIX
workstation networks (version 1.3)," Dept. of Computer Science, Australian National
University, Canberra 0200 ACT, Australia, no. TR-CS-96-03, Mar. 1996.
Abstract: autoson is a tool for scheduling independent processes across
a network of UNIX workstations. It provides a type of distributed batch queue that enables
execution of a stream of processes in a flexible and convenient manner with minimum impact
on interactive users. autoson can run as a single-user private system, or
as a shared resource for multiple users. Support is provided for local networks consisting
of any mixture of hosts running Solaris One, Solaris Two, IRIX, OSF1, ULTRIX, HP-UX or
Linux. The only requirement is a common file-system.
Shantanu Goel and Dan Duchamp, "Linux Device
Driver Emulation in Mach," in USENIX 1996 Annual Technical Conference,
(San Diego, California), Jan 1996.
Abstract: We describe the design and performance of code added to the Mach
microkernel (Mach 4.0, version UK02p21) that permits one to build a Mach kernel that
includes unmodified Linux device drivers. We have written emulation code to support all
Linux 1.3.35 network and SCSI drivers for the ISA and PCI I/O buses. Emulation increases
latency, but very little. The degree depends on both device and operation, and varies from
2 microseconds for receiving small (60 byte) network packets up to 197 microseconds for
writing 16KB to an ISA SCSI device.
Kevin Lai and Mary Baker, "A Performance
Comparison of UNIX Operating Systems on the Pentium," in USENIX 1996 Annual
Technical Conference, (San Diego, California), pp. 265-277, Jan. 1996.
Abstract: This paper evaluates the performance of three popular versions of the
UNIX operating system on the x86 architecture: Linux, FreeBSD, and Solaris. We evaluate
the systems using freely available micro- and application benchmarks to characterize the
behavior of their operating system services. We evaluate the currently available major
releases of the systems "as-is," without any performance tuning. Our results show that
the x86 operating systems and system libraries we tested fail to deliver the Pentium's
full memory write performance to applications. On small-file workloads, Linux is an order
of magnitude faster than the other systems. On networking software, FreeBSD provides two
to three times higher bandwidth than Linux. In general, Solaris performance usually lies
between that of the other two systems. Although each operating system out-performs the
others in some area, we conclude that no one system offers clearly better overall
performance. Other factors, such as extra features, ease of installation, or freely
available source code, are more convincing reasons for choosing a particular system.
Alan Cox, "Network Buffers And
Memory Management," Linux Journal, vol. 1, no. 30, Oct. 1996.
Michael K. Johnson, "The
Linux Kernel Hackers' Guide, version 0.7," 1996.
Angelos D. Keromytis, John Ioannidis and Jonathan M. Smith, "Implementing
IPsec," in Proc. of Global Internet (Globecom), (Phoenix, Arizona), Nov.
1997.
Abstract: The IP Security protocols are sufficiently mature to benefit from
multiple independent implementations and worldwide deployment. Towards that goal, we
implemented the protocols for the BSD/OS, Linux, OpenBSD and NetBSD. (The BSD/OS
and Linux versions were developed in Greece by John Ioannidis between Fall of 1995 and
Winter of 1997; the NetBSD port was done by Angelos Keromytis in December of 1996, also in
Greece. Development on the NetBSD port continues at the University of Pennsylvania. The
OpenBSD port was originally done by Angelos Keromytis and Niels Provos.) While some
differences in the implementations exist due to the differences in underlying operating
system structures, the design philosophy is common. A radix tree, namely the one used by
the BSD code for routing purposes, is used to implement the policy engine; a transform
table switch is used to make addition of security transformations an easy process; a
lightweight kernel-user communication mechanism is used to pass key material and other
configuration information from user space to kernel space, and to report asynchronous
events such as requests for new keys from kernel space to a user-level keying daemon; and
two distinct ways of intercepting outgoing packets and applying the IPsec transformations
to them are employed. In this paper, the techniques used in our implementations are
explained, differences in approaches are analysed, and hints are given to potential future
implementors of new transforms.
Keywords: IP; security; encryption; authentication; network; implementation
Andrew Campbell and G. Coulson, "QoS
Adaptive Transports: Delivering Scalable Media to the Desk Top," IEEE Network,
vol. 11, no. 2, pp. 18-27, Mar. 1997.
Abstract: By trading off temporal and spatial quality with available bandwidth,
or manipulating the playout time of continuous media in response to variation in delay,
audio and video flows can be made to adapt to fluctuating network conditions with minimal
perceptual distortion. In this paper we describe the implementation of an adaptive
transport system that incorporates a QoS-oriented API and a range of QoS mechanisms that
best assist multimedia applications in adapting to fluctuations in the delivered network
QoS. The system, which is an instantiation of the transport and network layers of a QoS
architecture, is implemented in a multi-ATM switch network environment with Linux-based PC
end systems and continuous media file servers. A performance evaluation of the system
configured to support a Video-on-Demand application scenario is presented and discussed. A
novel aspect of the system is the implementation of a QoS adaptation algorithm which
allows applications to delegate to the transport system responsibility for augmenting or
reducing the perceptual quality of video and audio flows when network resource
availability increases or decreases, respectively.
Keywords: Adaptive services; scalable audio and video flows; multimedia
transport; dynamic QoS management; QoS architecture
D. S. Alexander, M. Shaw, S. M. Nettles and J. M. Smith, "Active Bridging," ACM
Computer Communication Review, Cannes, France, vol. 27, no. 4, pp. 101-111, Oct. 1997.
Abstract: Active networks accelerate network evolution by permitting the network
infrastructure to be programmable, on a per-user, per-packet, or other basis. This
programmability must be balanced against the safety and security needs inherent in shared
resources. This paper describes the design, implementation, and performance of a new type
of network element, an Active Bridge. The active bridge can be reprogrammed 'on the fly',
with loadable modules called switchlets. To demonstrate the use of the active property, we
incrementally extend what is initially a programmable buffered repeater with switchlets
into a self-learning bridge, and then a bridge supporting spanning tree algorithms. To
demonstrate the agility that active networking gives, we show how it is possible to
upgrade a network from an 'old' protocol to a 'new' protocol on-the-fly. Moreover, we are
able to take advantage of information unavailable to the implementors of either protocol
to validate the new protocol and fall back to the old protocol if an error is detected.
This shows that the Active Bridge can protect itself from some algorithmic failures in
loadable modules. Our approach to safety and security favors static checking and
prevention over dynamic checks when possible. We rely on strong type checking in the Caml
language for the loadable module infrastructure, and achieve respectable performance. The
prototype implementation on a Pentium-based HP Netserver LS running Linux with 100 Mbps
Ethernet LANs achieves ttcp throughput of 16 Mbps between two PCs running Linux, compared
with 76 Mbps unbridged. Measured frame rates are in the neighborhood of 1800 frames per
second.
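The "self-learning bridge" stage mentioned in the abstract above can be pictured with a short sketch. This is not the authors' Caml switchlet code; the class and method names are hypothetical, and the sketch covers only the learning and forwarding decision, with no spanning tree and no table ageing.

```python
class LearningBridge:
    """Forwarding decision of a self-learning bridge (illustrative only)."""

    def __init__(self):
        # Maps a source MAC address to the port it was last seen on.
        self.table = {}

    def handle(self, src, dst, in_port, ports):
        """Return the list of ports the frame should be sent out on."""
        # Learn: the sender is reachable through the arrival port.
        self.table[src] = in_port
        out_port = self.table.get(dst)
        if out_port is None:
            # Unknown destination: flood to every port except the arrival port.
            return [p for p in ports if p != in_port]
        if out_port == in_port:
            # Destination is on the same segment as the sender: filter the frame.
            return []
        # Known destination on another segment: forward to that port only.
        return [out_port]
```

A buffered repeater corresponds to the flooding branch alone; adding the table turns it into a learning bridge, which is the kind of incremental extension the abstract describes.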
Michael Hasenstein, "IP Network Address
Translation," Chemnitz University of Technology, Chemnitz, Germany, 1997.
Keywords: firewall; NAT; network address translation
Bob Gray, "PC hardware for
source code UNIX - the double-edged sword," ;login:, vol. 23, no. 3, pp.
30-39, Jun. 1998.
Abstract: The article explores many of the issues surrounding choosing PC
hardware and comes up with some specific recommendations focused on running source code UNIX.
Keywords: hardware; PC; UNIX; FreeBSD; Linux
William S. Marcus, Ilija Hadzic, Anthony J. McAuley and Jonathan M. Smith, "Protocol Boosters: Applying
Programmability to Network Infrastructures," IEEE Communications Magazine,
vol. 36, no. 10, pp. 79-83, Oct 1998.
Abstract: This article describes a novel methodology for protocol design, using
incremental construction of the protocol from elements called protocol boosters on an
as-needed basis. Protocol boosters are an adaptation technique that allows dynamic and
efficient protocol customization to heterogeneous environments. By design, the boosting
mechanism is under control of a policy, which determines when augmentation is required.
Thus, many portions of a protocol stack execute only as necessary, permitting significant
increases in performance relative to general-purpose protocols. Design principles for
protocol boosters are presented, as well as an example booster. Two implementation
platforms are described: (1) an augmented Linux operating system, which is freely
available to other researchers; and (2) a rapidly reprogrammable hardware prototype,
called the Programmable Protocol Processing Pipeline (P4), which is based on off-the-shelf
FPGA technology. Since protocol boosters are programmed functions and can be
network-resident, a programmable network infrastructure is necessary to exploit their full
capability. Thus, protocol boosters are an ideal application for an on-the-fly
programmable network infrastructure.
Douglas P. Ghormley, David Petrou, Steven H. Rodrigues and Thomas E. Anderson, "SLIC:
An Extensibility System for Commodity Operating Systems," in USENIX 1998
Annual Technical Conference, (New Orleans, Louisiana), Jun. 1998.
Abstract: Modern commodity operating systems are large and complex systems
developed over many years by large teams of programmers, containing hundreds of thousands
of lines of code. Consequently, it is extremely difficult to add significant new
functionality to these systems. In response to this problem, a number of recent research
projects have explored novel operating system architectures to support untrusted
extensions, including SPIN, VINO, Exokernel, and Fluke. Unfortunately, these architectures
require substantial implementation effort and are not generally available in commodity
systems. In contrast, by leveraging the technique of interposition, we have designed and
implemented a prototype extension system called SLIC which requires only trivial operating
system changes. SLIC efficiently inserts trusted extension code into commodity operating
systems, enabling a large class of trusted extensions for existing commodity operating
systems such as Solaris and Linux, while retaining full compatibility with existing
application binaries. By interposing trusted extensions on existing kernel interfaces, our
solution enables extensions which are protected from malicious applications, are enforced
upon uncooperative applications, are composable with extensions from other third-party
sources, and can be developed at the user-level using state-of-the-art development tools.
We have used SLIC to implement and demonstrate a number of useful operating system
extensions, including a patch to fix a security hole described in a CERT advisory, a
simple encryption file system, and a restricted execution environment for arbitrary
untrusted binaries. Performance measurements of the SLIC prototype demonstrate a one-time
installation cost of 2-8 msec and a per-extension invocation overhead commensurate with a
procedure call.
Peter S. Magnusson, Fredrik Larsson, Andreas Moestedt, Bengt Werner, Fredrik Dahlgren,
Magnus Karlsson, Fredrik Lundholm, Jim Nilsson, Per Stenström and Hakan Grahn, "SimICS/sun4m:
A Virtual Workstation," in USENIX 1998 Annual Technical Conference, (New
Orleans, Louisiana), Jun. 1998.
Abstract: System level simulators allow computer architects and system software
designers to recreate an accurate and complete replica of the program behavior of a target
system, regardless of the availability, existence, or instrumentation support of such a
system. Applications include evaluation of architectural design alternatives as well as
software engineering tasks such as traditional debugging and performance tuning. We
present an implementation of a simulator acting as a virtual workstation fully compatible
with the sun4m architecture from Sun Microsystems. Built using the system-level SPARC V8
simulator SimICS, SimICS/sun4m models one or more SPARC V8 processors, supports
user-developed modules for data cache and instruction cache simulation and execution
profiling of all code, and provides a symbolic and performance debugging environment for
operating systems. SimICS/sun4m can boot unmodified operating systems, including Linux
2.0.30 and Solaris 2.6, directly from snapshots of disk partitions. To support essentially
arbitrary code, we implemented binary-compatible simulators for several devices, including
SCSI, console, interrupt, timers, EEPROM, and Ethernet. The Ethernet simulation hooks into
the host and allows the virtual workstation to appear on the local network with full
services available (NFS, NIS, rsh, etc). Ethernet and console traffic can be recorded for
future playback. The performance of SimICS/sun4m is sufficient to run realistic workloads,
such as the database benchmark TPC-D, scaling factor 1/100, or an interactive network
application such as Mozilla. The slowdown in relation to native hardware is in the range
of 25 to 75 (measured using SPECint95). We also demonstrate some applications, including
modeling an 8-processor sun4m version (which does not exist), modeling future memory
hierarchies, and debugging an operating system.
Michael Hicks, Jonathan Moore, Scott Alexander, Carl Gunter and Scott Nettles, "PLANet: An Active Internetwork,"
in Proceedings of the Conference on Computer Communications (IEEE Infocom), (New
York), Mar. 1999.
Abstract: Active networking addresses the issue of slow network evolution by
making the network programmable, and therefore extensible. We have designed a system based
on a special purpose programming language, PLAN (Packet Language for Active Networks),
which allows exploration of several key dimensions of the active networking design space;
in particular, our system supports both programmable packets and dynamic server
extensions. We have used our system to build an active internetwork, PLANet, in which all
packets are PLAN programs and new services can be dynamically added to the network using a
combination of programmability features. PLANet is implemented in user-space using a
byte-code interpreted version of the Caml language running on Linux PCs and currently
supports Ethernet and IP as link layers. On 300 MHz Pentium-II's over 100 Mbps Ethernet,
PLANet routers can achieve 48 Mbps and can switch over 5000 packets per second. As an
illustration of the uses of programmability, we also present several experiments that
demonstrate how PLANet can support a variety of strategies for coping with congestion.
Keywords: Internet and experimental systems
Tzi-Cker Chiueh and Prashant Pradhan, "High Performance IP Routing
Table Lookup using CPU Caching," in Proceedings of the Conference on Computer
Communications (IEEE Infocom), (New York), Mar. 1999.
Abstract: Wire-speed IP (Internet Protocol) routers require very fast routing
table lookup for incoming IP packets. The routing table lookup operation is time consuming
because the part of an IP address used in the lookup, i.e., the network address portion,
is variable in length. This paper describes the routing table lookup algorithm used in a
cluster-based parallel IP router project called Suez. The innovative aspect of this
algorithm is its ability to use CPU caching hardware to perform routing table caching and
lookup directly by carefully mapping IP addresses to virtual addresses. By running a
detailed simulation model that incorporates the performance effects of the CPU memory
hierarchy against a packet trace collected from a major network router, we show that the
overall performance of the proposed algorithm can reach 87.87 million lookups per second
for a 500-MHz Alpha processor with a 16-KByte L1 cache and a 1-MByte L2 cache. This result
is one to two orders of magnitude faster than previously reported results on
software-based routing table lookup implementations. This paper also reports the
performance impacts of various architectural parameters in the proposed scheme and its
storage costs, together with the measurements of an implementation of the proposed scheme
on a Pentium-II machine running Linux.
Keywords: Routing & Multicasting
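The abstract above notes that lookup is expensive because the network portion of an address varies in length, so a router must find the longest matching prefix. The sketch below shows that baseline problem only. It is a plain software longest-prefix match for IPv4, not the Suez technique of mapping IP addresses to virtual addresses for CPU caching; the class name and table layout are assumptions for illustration.

```python
import ipaddress


class RoutingTable:
    """Naive IPv4 longest-prefix match: one hash table per prefix length."""

    def __init__(self):
        # prefix length -> {network address as int: next hop}
        self._by_len = {}

    def add(self, cidr: str, next_hop: str) -> None:
        net = ipaddress.ip_network(cidr)
        self._by_len.setdefault(net.prefixlen, {})[int(net.network_address)] = next_hop

    def lookup(self, addr: str):
        ip = int(ipaddress.ip_address(addr))
        # Try the most specific prefix lengths first.
        for plen in sorted(self._by_len, reverse=True):
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            hop = self._by_len[plen].get(ip & mask)
            if hop is not None:
                return hop
        return None  # no matching route
```

Each lookup may probe one table per distinct prefix length, which is the kind of per-packet cost that motivates caching schemes like the one the paper proposes.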
Dug Song and Matt Undy, "NFR
Performance Testing," Anzen Computing, Ann Arbor, Michigan, Feb. 1999.
Abstract: In this paper, we attempt to loosely characterize NFR performance on
various operating systems, comparing freely available and proprietary sniffing technology.
Our results should be helpful in determining appropriate hardware/software configurations
for successful NFR deployment.
Keywords: packet sniffing; packet filter; tcpdump; Linux; BSD; Solaris; NT
Yui-Wah Lee, Kwong-Sak Leung and Mahadev Satyanarayanan, "Operation-based
Update Propagation in a Mobile File System," in 1999 USENIX Annual Technical
Conference, (Monterey, California, USA), Jun 1999.
Abstract: In this paper we describe a technique called operation-based update
propagation for efficiently transmitting updates to large files that have been modified on
a weakly connected client of a distributed file system. In this technique, modifications
are captured above the file-system layer at the client, shipped to a surrogate client that
is strongly connected to a server, re-executed at the surrogate, and the resulting files
transmitted from the surrogate to the server. If re-execution fails to produce a file
identical to the original, the system falls back to shipping the file from the client over
the slow network. We have implemented a prototype of this mechanism in the Coda File
System on Linux, and demonstrated performance improvements ranging from 40 percent to
nearly three orders of magnitude in reduced network traffic and elapsed time. We also
found a novel use of forward error correction in this context.
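The re-execute-then-verify flow described in the abstract above can be sketched in a few lines. This is an illustrative model, not the Coda implementation: the function names are hypothetical, SHA-256 is an assumed choice of fingerprint, and the forward error correction step mentioned in the abstract is not modelled.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Short digest the client can send cheaply over the slow link."""
    return hashlib.sha256(data).hexdigest()


def propagate(operation, old_file: bytes, new_file: bytes):
    """Return (file contents at the surrogate, which path was taken).

    operation stands in for the user operation logged at the client;
    old_file is the copy the surrogate already holds.
    """
    expected = fingerprint(new_file)
    # The surrogate re-executes the logged operation on its own copy.
    replayed = operation(old_file)
    if fingerprint(replayed) == expected:
        # Only the operation and a digest crossed the slow network.
        return replayed, "re-executed"
    # Re-execution did not reproduce the file: ship it from the client.
    return new_file, "shipped"
```

When the operation is deterministic, only a small log entry and a digest travel over the weak connection; otherwise the fallback preserves correctness at the cost of sending the whole file.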
Erez Zadok, Ion Badulescu and Alex Shender, "Extending
File Systems Using Stackable Templates," in 1999 USENIX Annual Technical
Conference, (Monterey, California, USA), Jun 1999.
Abstract: Extending file system functionality is not a new idea, but a desirable
one nonetheless[6,14,18]. In the several years since stackable file systems were first
proposed, only a handful are in use[12,19]. Impediments to writing new file systems
include the complexity of operating systems, the difficulty of writing kernel-based code,
the lack of a true stackable vnode interface[14], and the challenges of porting one file
system to another operating system. We advocate writing new stackable file systems as
kernel modules. As a starting point, we propose a portable, stackable template file system
we call Wrapfs (wrapper file system). Wrapfs is a canonical, minimal stackable file system
that can be used as a pattern across a wide range of operating systems and file systems.
Given Wrapfs, developers can add or modify only that which is necessary to achieve the
desired functionality. Wrapfs takes care of the rest, and frees developers from the
details of operating systems. Wrapfs templates exist for several common operating systems
(Solaris, Linux, and FreeBSD), thus alleviating portability concerns. Wrapfs can be ported
to any operating system with a vnode interface that provides a private data pointer for
each data structure used in the interface. The overhead imposed by Wrapfs is only 5-7%.
This paper describes the design and implementation of Wrapfs, explores portability issues,
and shows how the implementation was achieved without changing client file systems or
operating systems. We discuss several examples of file systems written using Wrapfs.
Theodore Ts'o, "Standalone
Device Drivers in Linux," in 1999 USENIX Annual Technical Conference,
(Monterey, California, USA), Jun 1999.
Abstract: Traditionally, Unix device drivers have been developed and distributed
inside the kernel. There are a number of good reasons why this method is the predominant
way most device drivers are distributed. First of all, it simplifies the packaging and
distribution issues for the device driver. Secondly, it makes it easier to make changes to
the interfaces between the kernel and the device drivers. However, this traditional scheme
has a number of major disadvantages. First of all, it means that each version of the
device driver is linked to a specific version of the kernel. So if a device is only
supported in a development release, a customer who might otherwise want to use a stable,
production release might be forced to use a bleeding-edge system. Alternatively, there may
be bugs present in the device driver as included in the stable release of the kernel, but
which are fixed in the development kernel. Moreover, including device drivers with the
kernel is not scalable in the long term. If Linux-like systems are ever to be able to
support as many devices as various OS's from Redmond, Washington, hardware manufacturers
will have to be able to support and distribute drivers separately from the main kernel
distribution. Currently, the size of the total Linux kernel source base has been doubling
(roughly) every 18 months. Almost all of this growth has been the result of new device
drivers. If Linux is to be successful at Linus Torvalds's goal of Total World Domination,
it will be essential that we remove any limits to growth, and an exponential growth in the
size of the Linux kernel is obviously a long-term growth limitation.
Peter J. Braam, Michael J. Callahan, M. Satyanarayanan and Marc Schnieder, "Porting
the Coda File System to Windows," in 1999 USENIX Annual Technical Conference,
(Monterey, California, USA), Jun 1999.
Abstract: We first describe how the Coda distributed filesystem was ported to
Windows 95 and 98. Coda consists of user level cache managers and servers and kernel level
code for filesystem support. Severe reentrancy difficulties in the Win32 environment on
this platform were overcome by extending the DJGPP DOS C compiler package with kernel
level support for sockets and more flexible memory management. With this support library
and kernel modules for Windows 9x filesystems in place, the Coda file system client could
be ported with very little patching and will likely soon run as well on Windows 9x as on
Linux. We ported Coda file servers to Windows NT. For fileservers the Cygwin32 kit was
used. We will not report here on the port of the Coda client to Windows NT, which is in an
early stage. In both cases cross compilation from a Linux environment was most helpful to
get a good development environment.
Kenneth Preslan, Matthew O'Keefe and John Lekashman, "The
Global File System: A Shared Disk File System for *BSD and Linux," in 1999
USENIX Annual Technical Conference, (Monterey, California, USA), Jun 1999.
Donald K. Rosenberg, "Business
Issues in Free Software Licensing," in 1999 USENIX Annual Technical Conference,
(Monterey, California, USA), Jun 1999.
Abstract: There are some odd ideas circulating about Linux and the Free Software
or Open Source movements. We can call these ideas primitive because they are simplistic
and not well thought-out, and because they go back to the reductio ad absurdum of the
primitive peoples who believed (and still believe, we hear) in what anthropologists like
to talk about as the ``cargo cults.'' According to the anthropologists, these movements
began among South Seas peoples in the 19th century, when they awaited the arrival of large
ships which would restore to them all the wonderful goods their peoples had owned once,
long ago. After World War II these cults took the form of waiting for aircraft to descend
from the skies with their abundant cargoes. We are told that the believers even
constructed runways with mock aircraft on them, hoping to attract the passing air traffic.
We all smile--how much more we know than they--but today there is a firm body of thought
on the one hand that eventually all software in the future will be produced, shared, and
enjoyed on the Bazaar model: freely developed and given away by loosely-organized
programmers around the world, and superior in quality and design to the commercial
products of today. Given the behavior of much modern commercial software, one can
understand why the believers hope so fervently for the millennium. But commercial vendors
are just as likely to make the same mistake: we see articles and columnists hyping the
idea that if a software firm can just take the leap of faith into the arms of Open Source,
they will attract legions of the world's smartest programmers, working ceaselessly and
without compensation to improve the code the vendor has thrown among them. It does not
help that any announcement that a company is releasing source code is regarded by the
business community as a desperate act of last resort. What should be the approach of a
commercial software vendor to the Open Source space? And what do they really want, anyway?
Craig Metz, "Porting
Kernel Code to Four BSDs and Linux," in 1999 USENIX Annual Technical
Conference, (Monterey, California, USA), Jun 1999.
Abstract: The U.S. Naval Research Laboratory develops and maintains a freely
available IPv6 and IP Security distribution. All of the software builds and runs on
BSD/OS, FreeBSD, NetBSD, and OpenBSD, and a growing portion of the software builds and
runs on Linux. Each of the four BSDs has evolved significantly from their original
4.4BSD-Lite ancestor, and increasingly more of that evolution is along divergent paths.
Linux shares no significant ancestry with the BSDs, but is still a POSIX system, which means
that many of the same high-level facilities are available even though their implementation
might be completely different. This paper discusses many of the differences and many of
the similarities we encountered in the internals of these systems. It also discusses the
techniques and glue software that we developed for isolating and abstracting the
differences so that we could build a significant base of system code that is portable
between all five systems.
T. Cortes, R. Cervera and Y. Bercerra, "Improving
Application Performance Through Swap Compression," in 1999 USENIX Annual
Technical Conference, (Monterey, California, USA), Jun 1999.
Abstract: There are many applications that use large amounts of memory. These
large applications take advantage of the swapping mechanism to run on the system as the
available physical memory is not enough for them to run. The same problem appears when we
try to run, on a laptop, the same applications we run on a desktop computer. These
applications will rely on the swapping mechanism as laptop computers usually have less
physical memory than desktop ones. Finally, multi-user environments tend to be very loaded
and their applications have to swap out part of their memory so that all applications can
run concurrently. In all these cases, the performance of the applications is much lower
than the one they would achieve if no swapping was needed. This happens because the
swapping mechanism has to access the disk to keep the pages that do not fit in memory. It
is clear that these applications, and the whole system, would benefit from a faster
swapping system. If we examine the same problem from a different point of view, we observe
that increasing the number of pages that fit in the swap space without increasing the
number of blocks in the swap partition would also be quite beneficial. We could run the
same applications on a laptop as on a desktop system. Remember that laptops also have
smaller disks compared to desktop ones. This increase in swap space would also help
multi-user systems to avoid getting out of memory. Finally, out-of-core applications could
be programmed more easily as the global-memory restriction would not be so important.
Nowadays it is quite normal to continue the office work at home. This usually means the use
of large applications on a Linux box. These large applications fit well in the office
machines but are too large to run efficiently on a smaller Linux box. In these cases, a
fast swapping mechanism would be very beneficial as those applications would run faster
and working at home would be less ``painful''. Furthermore, increasing the swap space at
no cost would allow these kinds of users to run applications that would normally not fit in
their home machines. These performance and space problems have motivated this work and its
objectives. The first, and most important, objective is to speed up the swap mechanism.
This will increase the performance of the applications that, for whatever reason, have to
keep part of their memory in the swap space. It is also an objective of this paper to
increase the size of the memory offered to the applications without increasing the number
of disk blocks in the swap partition. It is important to notice that should these two
objectives be in conflict, we will favor performance over capacity. Finally, we want to
achieve both improvements with the minimum number of changes in the original Linux kernel.
The main idea used to accomplish both objectives consists of compressing the pages that
have to be swapped out. This will increase the number of pages that can be placed in the
swap partition. Furthermore, it will also allow us to build a cache of compressed pages
that will decrease the number of times the system has to access the swap device. It is
important to notice that previous studies show that good compression ratios can be
achieved when compressing memory pages. The idea we present in this paper is similar, in
essence, to the one proposed by Douglis, but some improvements and modifications have been
made (see Section 5). We believe that now is a good time to reevaluate the results obtained
in this previous work as the technology has improved significantly which means that
compressing and decompressing pages can be done much more efficiently. This paper is
divided into 6 sections. In Section 2, we describe the concepts and ideas in which this
work has been based. In this section, we also present some preliminary results that will
lead the final design. Section 3 gives a detailed overview of the way the mechanism works.
Section 4 presents the benchmarks used and the results obtained while running them on our
system. In Section 5, we present the most significant work already done in the area.
Finally, Section 6 presents the main conclusions that can be extracted from this paper.
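Note: The core idea in the abstract above (compress pages before they are swapped out, so more pages fit in the same swap space) can be illustrated with a minimal sketch. This is not the authors' kernel implementation; the class name, the 4096-byte page size, and the use of zlib are all assumptions chosen for illustration.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size in bytes (hypothetical, for illustration)


class CompressedSwap:
    """Toy user-space model of a swap area that stores pages compressed."""

    def __init__(self):
        self.slots = {}  # page number -> compressed bytes

    def swap_out(self, page_no, data):
        # Compress the page before it is written to the (simulated) swap area.
        if len(data) != PAGE_SIZE:
            raise ValueError("expected exactly one page of data")
        self.slots[page_no] = zlib.compress(data)

    def swap_in(self, page_no):
        # Decompress the page when it is brought back into memory.
        return zlib.decompress(self.slots.pop(page_no))

    def bytes_used(self):
        # Space consumed in the swap area by all stored pages.
        return sum(len(blob) for blob in self.slots.values())


swap = CompressedSwap()
page = (b"linux" * 1000)[:PAGE_SIZE]  # a highly compressible example page
swap.swap_out(7, page)
used = swap.bytes_used()
restored = swap.swap_in(7)
```

Here `used` is smaller than `PAGE_SIZE` because the example page is repetitive; real memory pages compress less predictably, which is why the paper evaluates compression ratios experimentally.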
Peter Salus, "UNIX
to Linux in Perspective," in 1999 USENIX Annual Technical Conference,
(Monterey, California, USA), Jun 1999.
Rolf Riesen, Ron Brightwell, Lee Ann Fisk, Tramm Hudson, Jim Otto and Arthur B.
Maccabe, "Cplant,"
in 1999 USENIX Annual Technical Conference, (Monterey, California, USA), Jun 1999.
Abstract: The Computational Plant project at Sandia National Laboratories is
developing a large-scale, massively parallel computing resource from a cluster of
commodity computing and networking components. We are combining the knowledge and research
of previous and ongoing commodity cluster projects with our expertise in designing,
developing, using, and maintaining large-scale MPP machines. This paper describes the main
parts of the architecture and discusses the most important design choices and decisions.
Scaling to hundreds and thousands of nodes requires more than simply combining readily
available software and hardware. We will highlight some of the more crucial pieces that
make Cplant scalable.
Bob Felderman, "The Next
Generation: GM, LANai7 and 64-bit PCI," in 1999 USENIX Annual Technical
Conference, (Monterey, California, USA), Jun 1999.
Abstract: Myricom has been shipping GM software for the past several months.
This paper will describe the current status of the software, some performance results and
expectations for software deliveries on new platforms and new Myricom hardware.
David M. Halstead, Brett Bode, Dave Turner and Vasily Lewis, "Giga-Plant
Scalable Cluster," in 1999 USENIX Annual Technical Conference, (Monterey,
California, USA), Jun 1999.
Abstract: The Giga-Plant is a next generation compute cluster architecture under
construction within the Scalable Computing Laboratory (SCL). This work describes the
general cluster design philosophy utilized on this machine and others, and illustrates the
performance evaluation process that was exercised in order to make an informed hardware
purchasing decision. We present network communication throughput results taken from
several hardware platforms using Fast Ethernet and Gigabit Ethernet with and without Jumbo
Frames. We show that, despite throughput in excess of 800Mbit/s, communication latency is
the critical factor in determining the viability of commodity network hardware in parallel
processing applications.
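Note: The claim above that latency, not raw throughput, decides viability for parallel applications can be seen with a simple cost model, time = latency + size / bandwidth. The sketch below is only illustrative: the 800 Mbit/s figure comes from the abstract, but the 100 microsecond latency and the message sizes are assumed values, not measurements from the paper.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bits_per_s):
    """Simple linear cost model: fixed latency plus serialization time."""
    return latency_s + (message_bytes * 8) / bandwidth_bits_per_s


LATENCY_S = 100e-6   # assumed per-message latency (hypothetical)
BANDWIDTH = 800e6    # 800 Mbit/s, the throughput figure quoted in the abstract

small = transfer_time(1024, LATENCY_S, BANDWIDTH)              # 1 KB message
large = transfer_time(10 * 1024 * 1024, LATENCY_S, BANDWIDTH)  # 10 MB message

# Fraction of the total transfer time that is pure latency.
small_latency_share = LATENCY_S / small
large_latency_share = LATENCY_S / large
```

Under these assumed numbers, latency accounts for most of the cost of the small message and almost none of the cost of the large one, which is why codes that exchange many small messages are sensitive to latency.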
Yutaka Ishikawa, Atsushi Hori, Hiroshi Tezuka, Shinji Sumimoto, Toshiyuki Takahashi and
Hiroshi Harada, "RWCP PC Cluster
Programming Environment," in 1999 USENIX Annual Technical Conference,
(Monterey, California, USA), Jun 1999.
Abstract: We have been developing the SCore cluster system software running on
top of Linux. As shown in Figure 1, the SCore System software consists of a global
operating system called SCore-D, a communication facility called PM, MPI implemented on PM
called MPICH-PM/CLUMP, a software distributed shared memory system called SCASH, and a
multithreaded programming language called MPC++. To realize a high-performance system
using commodity hardware and software, the following key technologies have been employed:
a user-level zero-copy message transfer mechanism between nodes and a one-copy message
transfer mechanism within an SMP node, provided by a high-performance communication facility
called PM; a high-performance MPI implementation called MPICH-PM/CLUMP that integrates both
zero-copy message transfer and message-passing facilities in order to maximize performance;
and a multi-user environment using gang scheduling without degrading
communication performance, realized by an operating system daemon called SCore-D.
Henri E. Bal, Aske Plaat, Thilo Kielmann, Jason Maassen, Rob van Nieuwpoort and Ronald
Veldema, "Parallel
Computing on Wide-Area Clusters: the Albatross Project," in 1999 USENIX Annual
Technical Conference, (Monterey, California, USA), Jun 1999.
Abstract: The aim of the Albatross project is to study applications and
programming environments for wide-area cluster computers, which consist of multiple
clusters connected by wide-area networks. Parallel processing on such systems is useful
but challenging, given the large differences in latency and bandwidth between LANs and
WANs. We apply application-level optimizations that exploit the hierarchical structure of
wide-area clusters to minimize communication over the WANs. In addition, we use highly
efficient local-area communication protocols. We illustrate this approach using a
high-performance Java system that is implemented on a collection of four Myrinet-based
clusters connected by wide-area ATM networks. The optimized applications obtain high
speedups on this wide-area system.
Sameer Shende, "Profiling
and Tracing in Linux," in 1999 USENIX Annual Technical Conference,
(Monterey, California, USA), Jun 1999.
Abstract: Profiling and tracing tools can help make application parallelization
more effective and identify performance bottlenecks. Profiling presents summary statistics
of performance metrics while tracing highlights the temporal aspect of performance
variations, showing when and where in the code performance is achieved. A complex
challenge is the mapping of performance data gathered during execution to high-level
parallel language constructs in the application source code. Presenting performance data
in a meaningful way to the user is equally important. This paper presents a brief overview
of profiling and tracing tools in the context of Linux - the operating system most
commonly used to build clusters of workstations for high performance computing.
James Ahrens, "Using Linux
Clusters for Parallel Visualization and Rendering," in 1999 USENIX Annual
Technical Conference, (Monterey, California, USA), Jun 1999.
Peter J. Braam, "File Systems
for Clusters from a Protocol Perspective," in 1999 USENIX Annual Technical
Conference, (Monterey, California, USA), Jun 1999.
Abstract: The protocols used by distributed file systems vary widely. The aim of
this talk is to give an overview of these protocols and discuss their applicability for a
cluster environment. File systems like NFS have weak semantics, making tight sharing
difficult. AFS, Coda and InterMezzo give a great deal of autonomy to cluster members, and
involve a persistent file cache for each system. True cluster file systems such as found
in VMS VAXClusters, XFS, GFS introduce a shared single image, but introduce complex
dependencies on cluster membership.
Kenneth W. Preslan, Manish Agarwal, Andrew P. Barry, Jonathan E. Brassow, Russell
Cattelan, Grant M. Erickson, Erling Nygaard, Seth Van Oort, Christopher J. Sabol, Steven
R. Soltis, David C. Teigland, Mike Tilstra and Matthew T. O'Keefe, "A 64-bit, Shared
Disk File System for Linux," in 1999 USENIX Annual Technical Conference,
(Monterey, California, USA), Jun 1999.
Abstract: In computer systems today, speed and responsiveness are often
determined by network and storage subsystem performance. Faster, more scalable networking
interfaces like Fibre Channel and Gigabit Ethernet provide the scaffolding from which
higher performance implementations may be constructed, but new thinking is required about
how machines interact with network-enabled storage devices. We have developed a Linux file
system called GFS (the Global File System) that allows multiple Linux machines to access
and share disk and tape devices on a Fibre Channel or SCSI storage network. We plan to
extend GFS by transporting packetized SCSI commands over IP so that any GFS-enabled Linux
machine can access shared network devices. GFS will perform well as a local file system,
as a traditional network file system running over IP, and as a high-performance cluster
file system running over storage networks like Fibre Channel. GFS device sharing provides
a key cluster-enabling technology for Linux, helping to bring the availability,
scalability, and load balancing benefits of clustering to Linux. Our goal is to develop a
scalable, (in number of clients and devices, capacity, connectivity, and bandwidth)
server-less file system that integrates IP-based network attached storage (NAS) and
Fibre-Channel-based storage area networks (SAN). We call this new architecture Storage
Area InterNetworking (SAINT). It exploits the speed and device scalability of SAN
clusters, and provides the client scalability and network interoperability of NAS
appliances. Our Linux port shows that the GFS architecture is portable across different
platforms, and we are currently working on a port to FreeBSD. The GFS code is open source
(GPL) software freely available on the Internet at http://www.globalfilesystem.org.
Pat Hanrahan, "Multi-graphics:
Towards Scalable, Distributed Visualization," in 1999 USENIX Annual Technical
Conference, (Monterey, California, USA), Jun 1999.
Walter B. Ligon and Robert B. Ross, "An Overview of
the Parallel Virtual File System," in 1999 USENIX Annual Technical Conference,
(Monterey, California, USA), Jun 1999.
Abstract: As the PC cluster has grown in popularity as a parallel computing
platform, the demand for system software for this platform has grown as well. One common
piece of system software available for many commercial parallel machines is the parallel
file system. Parallel file systems offer higher I/O performance than single disk or RAID
systems, provide users with a convenient and consistent name space across the parallel
machine, support physical distribution of data across multiple disks and network entities
(I/O nodes), and typically include additional I/O interfaces to support larger files and
control of file parameters. The Parallel Virtual File System (PVFS) Project is an effort
to provide a parallel file system for PC clusters. As a parallel file system, PVFS
provides a global name space, striping of data across multiple I/O nodes, and multiple
user interfaces. The system is implemented at the user level, so no kernel modifications
are necessary to install or run the system. All communication is performed using TCP/IP,
so no additional message passing libraries are needed, and support is included for using
existing binaries on PVFS files. This paper describes the key aspects of the PVFS system
and presents recent performance results on a 64 node Beowulf workstation. Conclusions are
drawn and areas of future work are discussed.
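Note: The "striping of data across multiple I/O nodes" mentioned above can be sketched as a mapping from a logical file offset to an I/O node and an offset within that node's local file. The round-robin layout, the function name `locate`, and the 64 KB stripe size below are assumptions for illustration; they are not taken from the PVFS implementation.

```python
def locate(offset, stripe_size, num_io_nodes):
    """Map a logical file offset to (I/O node index, offset on that node).

    Assumes simple round-robin striping: stripe 0 goes to node 0,
    stripe 1 to node 1, and so on, wrapping around after the last node.
    """
    stripe_index = offset // stripe_size        # which stripe holds the byte
    node = stripe_index % num_io_nodes          # round-robin node choice
    stripes_before = stripe_index // num_io_nodes  # earlier stripes on that node
    local_offset = stripes_before * stripe_size + offset % stripe_size
    return node, local_offset


STRIPE = 64 * 1024  # hypothetical stripe size in bytes

first = locate(0, STRIPE, 4)                  # start of the file
second = locate(STRIPE, STRIPE, 4)            # start of the second stripe
wrapped = locate(4 * STRIPE + 10, STRIPE, 4)  # fifth stripe wraps to node 0
```

With four I/O nodes, consecutive stripes land on different nodes, so a large sequential read can be served by all four nodes in parallel.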
Rajeev Thakur, William Gropp and Ewing Lusk, "MPI-IO: A
Standard, Portable API for High-Performance Parallel I/O," in 1999 USENIX
Annual Technical Conference, (Monterey, California, USA), Jun 1999.
Abstract: Although a standard API for message-passing, namely MPI, has existed
for a while, a similar standard API hasn't been available for performing I/O from parallel
programs. Most parallel file systems (and therefore parallel programs) simply use the Unix
I/O API as the interface for parallel I/O. The Unix API, although portable, is not
appropriate as an API for high-performance parallel I/O. With the Unix API, one cannot
express (in a single function) the kinds of accesses that are common in parallel programs,
namely, each process accessing a noncontiguous data set and a group of processes accessing
a file simultaneously. Without this information, the file system cannot perform certain
optimizations that can otherwise improve performance dramatically. Furthermore, many
vendors support vendor-specific extensions to the basic Unix API, and using any of these
extensions makes programs nonportable. A solution to these problems is MPI-IO, the I/O
interface defined in the MPI-2 standard [2].
Patrick Bozeman, "A Modular High Performance Implementation of the Virtual Interface Architecture,"
in 1999 USENIX Annual Technical Conference, (Monterey, California, USA), Jun 1999.
Ian Foster, "The Beta
Grid: A National Infrastructure for Computer Systems Research," in 1999 USENIX
Annual Technical Conference, (Monterey, California, USA), Jun 1999.
Joel L. Clinkenbeard, "Compaq's
High Performance Programming Environment for Linux," in 1999 USENIX Annual
Technical Conference, (Monterey, California, USA), Jun 1999.
Greg Chesson, "SGI, Linux,
Scaleable Systems," in 1999 USENIX Annual Technical Conference, (Monterey,
California, USA), Jun 1999.
Rick Stevens, "The
High Performance Computing, Extreme Linux, Open Source, and Systems Software Initiative,"
in 1999 USENIX Annual Technical Conference, (Monterey, California, USA), Jun 1999.
Rémy Evard, "Chiba
City," in 1999 USENIX Annual Technical Conference, (Monterey, California,
USA), Jun 1999.
William Humphrey and Susan Coghlan, "Using a
Linux Cluster for Linear Accelerator Modeling," in 1999 USENIX Annual
Technical Conference, (Monterey, California, USA), Jun 1999.
Abstract: Large-scale scientific parallel computations, traditionally performed
on specialized, expensive supercomputers and massively-parallel-processing computers, are
being performed more and more on relatively inexpensive Linux-based clusters of computers.
The rapid improvement in the performance and memory capacity of commodity processors from
Intel and DEC, coupled with the stability and performance of the Linux operating system,
has made it possible to assemble networked clusters of PCs at a fraction of the cost
of supercomputers. Compute power is only part of the story, however; very large scientific
simulation codes often also require large memory and network bandwidth resources, and this
must be taken into account when developing a Linux cluster for these types of problems.
Researchers at Los Alamos National Laboratory, studying the dynamics of intense particle
beams in accelerators, have developed a parallel simulation code to model the motion of
charged particles through linear accelerator components, and have studied the performance
of this code on a 128-processor, 64-node cluster of PCs running Linux as well as on
a 128-node Origin 2000 SMP supercomputer. The Linux cluster includes two forms of
interprocessor networking: inexpensive 100-BT Ethernet, with a maximum per-node bandwidth
of 100 Mb/s, and Myrinet networking, with a maximum per-node bandwidth of 1 Gb/s. We
compare the time to compute a typical problem in parallel on the Linux cluster and on the
Origin 2000 platform, for increasing numbers of processors. We also compare the
price/performance of these systems, taking into account the cost of the networking
equipment when running across Myrinet vs. 100-BT, and the use of single- or dual-processor
PCs. For this hardware and application, the Linux-based PC cluster performs at
two-thirds to three-fourths of the performance of the Origin 2000 machine, but at much
lower total cost. Use of Myrinet hardware, it is seen, improves total performance, but the
added cost of the Myrinet network hardware does not always lead to better
price/performance value.
Garth A. Gibson, David F. Nagle, William Courtright II, Nat Lanza, Paul Mazaitis, Marc
Unangst and Jim Zelenka, "NASD
Scalable Storage Systems," in 1999 USENIX Annual Technical Conference,
(Monterey, California, USA), Jun 1999.
Abstract: The goal of CMU's Network-Attached Secure Disks (NASD) project is to
define the next era of storage system interfaces and architectures. To encourage industry
standardization of a compliant storage device/subsystem interface, we are working closely
with the National Storage Industry Consortium's working group on network-attached
storage. Our experimental demonstration of the NASD interface's value is device and
filesystem prototype software that delivers the scalability inherent in a NASD storage
architecture. To engage the academic community and to provide a reference implementation
for industry development, CMU is releasing its Linux and Digital UNIX ports of this
software. In this paper, we overview the NASD scalable storage architecture and the
code-base we are releasing for Linux.
Bill Nitzberg and Bob Henderson, "PBS on Extreme
Linux: An Overview," in 1999 USENIX Annual Technical Conference, (Monterey,
California, USA), Jun 1999.
Abstract: PBS, the Portable Batch System, was developed at NASA Ames Research
Center as a successor to the widely-used Network Queuing System (NQS). PBS provides
sophisticated resource management facilities for a wide range of systems: POSIX
1003.2d Batch Environment Standard compliance; a separate, fully tunable scheduler in
C (also supports tcl & BASL), with production schedulers available supporting priorities,
dedicated time, and dynamic backfilling; production-hardened and demonstrated
scalability (256 processor IBM SP, 352 processor SGI Origin cluster, and 130 processor
Linux cluster); full support for both serial and parallel jobs in a mixed
time-shared and space-shared environment; and fault isolation, crash recovery, and
support for logging and accounting. Although fully configurable, the typical PBS
installation on an Extreme Linux cluster consists of a few ``front-end'' nodes running:
the PBS client code, a single PBS server, and a single PBS scheduler. The rest of the
cluster (the ``back-end'' nodes) run the PBS MOM daemon, which actually executes the jobs.
A job (e.g., ``mpirun -np 4 ./hello'') is submitted along with a resource specification
(e.g., ``4 nodes with 256 MB for 10 minutes'') on one of the front-ends, handed to the
server, scheduled, run by the MOMs, and has output placed back on the front-end. Active
development work is continuing at NASA Ames and elsewhere, including: support for Kerberos
authentication, porting the Maui scheduler, providing a bi-directional interface to
Globus - allowing PBS jobs to be run via Globus, and adding advance reservation support.
In addition, a web client interface is planned. PBS is an ``open source'' package, and
runs on nearly every UNIX variant; there is currently no support for Windows. An RPM of
the current release (version 2.1) is available via pbs.mrj.com.
Vincent Schuster, "Linux SMP and
Tool Support," in 1999 USENIX Annual Technical Conference, (Monterey,
California, USA), Jun 1999.
Greg Lindahl and Luke Lonergan, "Supercomputing
on Linux Alpha," in 1999 USENIX Annual Technical Conference, (Monterey,
California, USA), Jun 1999.
David Jackson, "Maui Scheduler
on Linux," in 1999 USENIX Annual Technical Conference, (Monterey,
California, USA), Jun 1999.
Abstract: As Linux clusters continue to establish themselves as effective high
performance computing (HPC) platforms, system administrators are finding themselves in
need of advanced scheduling and resource management capabilities. The Maui scheduler was
designed as an HPC scheduler with advanced features such as aggressive backfill, fairshare
scheduling, multiple fairness policies, dynamic prioritization, dedicated consumable
resource tracking and enforcement, and a very extensive advance reservation system. These
features allow the scheduler administrator to control and optimally utilize the system
resources available. This presentation discusses some of the features of the Maui
scheduler as well as experiences using the Maui scheduler on Linux clusters.
Jeffery A. Kuehn, "Ptools
Support for Linux," in 1999 USENIX Annual Technical Conference, (Monterey,
California, USA), Jun 1999.
Kevin Buettner, "Metrowerks
Linux Technology Roadmap," in 1999 USENIX Annual Technical Conference,
(Monterey, California, USA), Jun 1999.
Ariel Cohen, Sampath Rangarajan and Hamilton Slye, "On the
Performance of TCP Splicing for URL-aware Redirection," in 2nd USENIX
Symposium on Internet Technologies and Systems, (Boulder, Colorado, USA), Oct 1999.
Abstract: This paper describes the design, implementation and performance of a
layer-7 switch which supports URL-aware redirection of HTTP traffic. Currently, there are
several vendors who are beginning to announce the availability of such switches in the
market, but little or no implementation and performance information is available. We
discuss design issues pertaining to such switches through a prototype implementation of a
URL-aware switch in the Linux kernel, and analyze the performance of our implementation.
We investigate the use of TCP splicing as a mechanism for improving the performance of the
switch; we explore whether TCP splicing will benefit URL-aware redirection even though
HTTP connections, on average, are short-lived and transfer small amounts of data. Results
from our implementation show that TCP splicing does improve the performance of URL-aware
switches that handle short-lived HTTP connections. Our results also re-affirm earlier
findings that TCP splicing substantially improves the performance of any application-layer
proxy when large amounts of data are transferred through the splice.
Mark E. Crovella, Robert Frangioso and Mor Harchol-Balter, "Connection
Scheduling in Web Servers," in 2nd USENIX Symposium on Internet Technologies
and Systems, (Boulder, Colorado, USA), Oct 1999.
Abstract: Under high loads, a Web server may be servicing many hundreds of
connections concurrently. In traditional Web servers, the question of the order in which
concurrent connections are serviced has been left to the operating system. In this paper
we ask whether servers might provide better service by using non-traditional service
ordering. In particular, for the case when a Web server is serving static files, we
examine the costs and benefits of a policy that gives preferential service to short
connections. We start by assessing the scheduling behavior of a commonly used server
(Apache running on Linux) with respect to connection size and show that it does not appear
to provide preferential service to short connections. We then examine the potential
performance improvements of a policy that does favor short connections
(shortest-connection-first). We show that mean response time can be improved by factors of
four or five under shortest-connection-first, as compared to an (Apache-like)
size-independent policy. Finally we assess the costs of shortest-connection-first
scheduling in terms of unfairness (i.e., the degree to which long connections suffer). We
show that under shortest-connection-first scheduling, long connections pay very little
penalty. This surprising result can be understood as a consequence of heavy-tailed Web
server workloads, in which most connections are small, but most server load is due to the
few large connections. We support this explanation using analysis.
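Note: The effect described above (favoring short connections improves mean response time while costing long connections little) can be illustrated with a toy model. The sketch below assumes a single server, non-preemptive service, and all connections present at time zero, which is far simpler than the paper's experimental setup; the workload values are made up to mimic "many small, one large".

```python
def completion_times(sizes):
    """Completion time of each job when served one at a time in list order."""
    clock = 0.0
    times = []
    for size in sizes:
        clock += size  # the server works on this job until it finishes
        times.append(clock)
    return times


def mean(values):
    return sum(values) / len(values)


# Hypothetical workload: nine small connections and one large one.
sizes = [1, 1, 1, 1, 100, 1, 1, 1, 1, 1]

fifo_times = completion_times(sizes)          # size-independent arrival order
scf_times = completion_times(sorted(sizes))   # shortest-connection-first

fifo_mean = mean(fifo_times)
scf_mean = mean(scf_times)

# Completion time of the large connection under each policy.
large_fifo = fifo_times[sizes.index(100)]
large_scf = scf_times[-1]
```

In this example the mean response time drops from 64.9 to 15.4 time units, while the large connection finishes at 109 instead of 104, a delay of under 5 percent.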
J. Hadi Salim and U. Ahmed, "Performance
Evaluation of Explicit Congestion Notification (ECN) in IP Networks," Internet
Engineering Task Force, no. 2884, Jul. 2000.
Abstract: This memo presents a performance study of the Explicit Congestion
Notification (ECN) mechanism in the TCP/IP protocol using our implementation on the Linux
Operating System. ECN is an end-to-end congestion avoidance mechanism proposed by S. Floyd
and incorporated into RFC 2481. We study the behavior of ECN for both bulk and
transactional transfers. Our experiments show that there is improvement in throughput over
NON ECN (TCP employing any of Reno, SACK/FACK or NewReno congestion control) in the case
of bulk transfers and substantial improvement for transactional transfers. A more complete
pdf version of this document is available at: http://www7.nortel.com:8080/CTL/ecnperf.pdf
This memo in its current revision is missing a lot of the visual representations and
experimental results found in the pdf version.