SEARCH ENGINE

Tuesday, June 21, 2011

SERVER (COMPUTING)

In computing, the term server is used to refer to one of the following:
a computer program running to serve the needs or requests of other programs (referred to in this context as "clients") which may or may not be running on the same computer.
a physical computer dedicated to running one or more such services, to serve the needs of programs running on other computers on the same network.
a software/hardware system (i.e. a software service running on a dedicated computer) such as a database server, file server, mail server, or print server.
In computer networking, a server is a program that operates as a socket listener.[1] The term server is also often generalized to describe a host that is deployed to execute one or more such programs.[2]
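To make the "socket listener" idea concrete, here is a minimal sketch in Python: the program binds to a port, listens, and answers each client request. The address and port are arbitrary; this is an illustration of the concept, not production code.

    import socket

    HOST, PORT = "127.0.0.1", 8080   # hypothetical address and port

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((HOST, PORT))
        srv.listen()                      # become a socket listener
        while True:
            conn, addr = srv.accept()     # block until a client connects
            with conn:
                request = conn.recv(1024)            # read the request
                conn.sendall(b"SERVED: " + request)  # answer the client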
A server computer is a computer, or series of computers, that links other computers or electronic devices together.
Many servers have dedicated functionality such as web servers, print servers, and database servers. Enterprise servers are servers that are used in a business context.


Usage

Servers provide essential services across a network, either to private users inside a large organization or to public users via the Internet. For example, when you enter a query in a search engine, the query is sent from your computer over the internet to the servers that store all the relevant web pages. The results are sent back by the server to your computer.
The term server is used quite broadly in information technology. Despite the many server-branded products available (such as server versions of hardware, software or operating systems), in theory any computerised process that shares a resource with one or more client processes is a server. To illustrate this, take the common example of file sharing. While the existence of files on a machine does not classify it as a server, the mechanism by which the operating system shares these files with clients is the server.
Similarly, consider a web server application (such as the multiplatform "Apache HTTP Server"). This web server software can be run on any capable computer. For example, while a laptop or personal computer is not typically known as a server, it can in these situations fulfill the role of one, and hence be labelled as one; the machine's purpose as a web server then classifies it, in general, as a server.
In the hardware sense, the word server typically designates computer models intended for hosting software applications under the heavy demand of a network environment. In this client–server configuration one or more machines, either a computer or a computer appliance, share information with each other with one acting as a host for the other.
While nearly any personal computer is capable of acting as a network server, a dedicated server will contain features making it more suitable for production environments. These features may include a faster CPU, increased high-performance RAM, and typically more than one large hard drive. More obvious distinctions include marked redundancy in power supplies, network connections, and even the servers themselves.
Between the 1990s and 2000s an increase in the use of dedicated hardware saw the advent of self-contained server appliances. One well-known product is the Google Search Appliance, a unit that combines hardware and software in an out-of-the-box packaging. Simpler examples of such appliances include switches, routers, gateways, and print servers, all of which are available in a near plug-and-play configuration.
Modern operating systems such as Microsoft Windows or Linux distributions are designed with a client–server architecture in mind. These operating systems attempt to abstract hardware, allowing a wide variety of software to work with components of the computer. In a sense, the operating system can be seen as serving hardware to the software, which in all but low-level programming languages must interact with it using an API.
These operating systems may be able to run programs in the background called either services or daemons. Such programs may wait in a sleep state for their necessity to become apparent, such as the aforementioned Apache HTTP Server software. Since any software that provides services can be called a server, modern personal computers can be seen as a forest of servers and clients operating in parallel.
The Internet itself is also a forest of servers and clients. Merely requesting a web page from a few kilometers away involves satisfying a stack of protocols that involve many examples of hardware and software servers. Among these are the routers, modems, domain name servers, and various other servers necessary to provide us with the World Wide Web.
Server hardware



[Image: A server rack seen from the rear]
Hardware requirements for servers vary, depending on the server application. Absolute CPU speed is not usually as critical to a server as it is to a desktop machine[citation needed]. Servers' duties to provide service to many users over a network lead to different requirements like fast network connections and high I/O throughput. Since servers are usually accessed over a network, they may run in headless mode without a monitor or input device. Processes that are not needed for the server's function are not used. Many servers do not have a graphical user interface (GUI) as it is unnecessary and consumes resources that could be allocated elsewhere. Similarly, audio and USB interfaces may be omitted.
Servers often run for long periods without interruption and availability must often be very high, making hardware reliability and durability extremely important. Although servers can be built from commodity computer parts, mission-critical enterprise servers are ideally very fault tolerant and use specialized hardware with low failure rates in order to maximize uptime, for even a short-term failure can cost more than purchasing and installing the system. For example, it may take only a few minutes of down time at a national stock exchange to justify the expense of entirely replacing the system with something more reliable. Servers may incorporate faster, higher-capacity hard drives, larger computer fans or water cooling to help remove heat, and uninterruptible power supplies that ensure the servers continue to function in the event of a power failure. These components offer higher performance and reliability at a correspondingly higher price. Hardware redundancy—installing more than one instance of modules such as power supplies and hard disks arranged so that if one fails another is automatically available—is widely used. ECC memory devices that detect and correct errors are used; non-ECC memory is more likely to cause data corruption.[citation needed]
To increase reliability, most servers use memory with error detection and correction, redundant disks, redundant power supplies and so on. Such components are also frequently hot swappable, allowing technicians to replace them on the running server without shutting it down. To prevent overheating, servers often have more powerful fans. As servers are usually administered by qualified engineers, their operating systems are also more tuned for stability and performance than for user friendliness and ease of use, with Linux holding a noticeably larger share of the server market than of the desktop market.[citation needed]
As servers need a stable power supply, good Internet access, and increased security, and are also noisy, it is usual to store them in dedicated server centers or special rooms. This makes it necessary to reduce power consumption, as the extra energy used generates more heat, which could cause the temperature in the room to exceed acceptable limits; hence server rooms are normally equipped with air conditioning. Server casings are usually flat and wide, adapted to store many devices next to each other in a server rack. Unlike ordinary computers, servers usually can be configured, powered up and down, or rebooted remotely, using out-of-band management.
Many servers take a long time for the hardware to start up and load the operating system. Servers often do extensive pre-boot memory testing and verification and startup of remote management services. The hard drive controllers then start up banks of drives sequentially, rather than all at once, so as not to overload the power supply with startup surges, and afterwards they initiate RAID system pre-checks for correct operation of redundancy. It is common for a machine to take several minutes to start up, but it may not need restarting for months or years.
Server operating systems

Server-oriented operating systems tend to have certain features in common that make them more suitable for the server environment, such as
GUI not available or optional
ability to reconfigure and update both hardware and software to some extent without restart,
advanced backup facilities to permit regular and frequent online backups of critical data,
transparent data transfer between different volumes or devices,
flexible and advanced networking capabilities,
automation capabilities such as daemons in UNIX and services in Windows, and
tight system security, with advanced user, resource, data, and memory protection.
Server-oriented operating systems can, in many cases, interact with hardware sensors to detect conditions such as overheating, processor and disk failure, and consequently alert an operator or take remedial measures itself.
Because servers must supply a restricted range of services to perhaps many users while a desktop computer must carry out a wide range of functions required by its user, the requirements of an operating system for a server are different from those of a desktop machine. While it is possible for an operating system to make a machine both provide services and respond quickly to the requirements of a user, it is usual to use different operating systems on servers and desktop machines. Some operating systems are supplied in both server and desktop versions with similar user interface.
The desktop versions of the Windows and Mac OS X operating systems are deployed on a minority of servers, as are some proprietary mainframe operating systems, such as z/OS. The dominant operating systems among servers are UNIX-based or open source kernel distributions, such as Linux (the kernel).[citation needed]
The rise of the microprocessor-based server was facilitated by the development of Unix to run on the x86 microprocessor architecture. The Microsoft Windows family of operating systems also runs on x86 hardware and, since Windows NT, has been available in versions suitable for server use.
While the role of server and desktop operating systems remains distinct, improvements in the reliability of both hardware and operating systems have blurred the distinction between the two classes. Today, many desktop and server operating systems share similar code bases, differing mostly in configuration. The shift towards web applications and middleware platforms has also lessened the demand for specialist application servers.
Servers on the Internet

Almost the entire structure of the Internet is based upon a client–server model. High-level root nameservers, DNS servers, and routers direct the traffic on the internet. There are millions of servers connected to the Internet, running continuously throughout the world.
World Wide Web
Domain Name System
E-mail
FTP file transfer
Chat and instant messaging
Voice communication
Streaming audio and video
Online gaming
Database servers
Virtually every action taken by an ordinary Internet user requires one or more interactions with one or more servers.
There are also technologies that operate on an inter-server level. Other services do not use dedicated servers; for example peer-to-peer file sharing, some implementations of telephony (e.g. Skype), and supplying television programs to several users (e.g. Kontiki, SlingBox).
Energy consumption of servers

In 2010, servers were responsible for 2.5% of energy consumption in the United States. A further 2.5% of United States energy consumption was used by cooling systems required to cool the servers. It was estimated in 2010, that if trends continued, by 2020, servers would use more of the world's energy than air travel.[3]

SERVER DEFINITION

A server is a software program, or the computer on which that program runs, that provides a specific kind of service to client software running on the same computer or on other computers on a network.

The client-server model is an architecture (i.e., a system design) that divides processing between clients and servers that can run on the same machine or on different machines on the same network. It is a major element of modern operating system and network design.

The client provides the user interface, such as a GUI (graphical user interface), and performs some or all of the processing on the requests it makes to the server, which maintains the data and processes the requests.

An example is a web server, which stores files related to web sites and serves (i.e., sends) them across the Internet to clients (i.e., web browsers) when requested by a user. By far the most popular web server program is Apache, which is claimed to host more than 68 percent of all web sites on the Internet.

As is the case with other server software, Apache can run on computers which are used for multiple purposes, such as ordinary desktop computers, as well as on dedicated hardware.

A file server is software, or hardware plus software, that is dedicated to storing files and making them accessible for reading and writing to clients (i.e., users) across a network. A print server is software or hardware that manages one or more printers. A network server manages network traffic. A name server maps user and computer names to machine addresses. A database server allows clients to interact with a database. An application server runs applications for clients.

A single computer can have multiple server software applications running on it. Also, it is possible for a computer to be both a client and a server simultaneously; this is accomplished by connecting to itself in the same way that a separate computer would.
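As a toy illustration of a machine being client and server simultaneously, the sketch below (Python; the port is arbitrary) starts an echo server in a background thread and then connects to it over the loopback interface, so the computer connects to itself just as a separate computer would.

    import socket
    import threading

    def echo_server(ready):
        with socket.socket() as srv:
            srv.bind(("127.0.0.1", 9090))   # hypothetical port
            srv.listen()
            ready.set()                     # signal: server is listening
            conn, _ = srv.accept()
            with conn:
                conn.sendall(conn.recv(1024))   # echo the request back

    ready = threading.Event()
    threading.Thread(target=echo_server, args=(ready,), daemon=True).start()
    ready.wait()

    with socket.socket() as cli:            # the "client" on the same host
        cli.connect(("127.0.0.1", 9090))    # the computer connects to itself
        cli.sendall(b"ping")
        print(cli.recv(1024))               # b'ping'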

Many large enterprises employ numerous dedicated server machines. A collection of servers in one location is commonly referred to as a server farm. If very heavy traffic is expected, load balancing is usually employed to distribute the requests among the various servers so that no single machine is overwhelmed.
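A toy sketch of the round-robin flavor of load balancing follows: requests are handed to the machines of a hypothetical server farm in rotation, so no single server is overwhelmed. Real load balancers also weigh server health and current load; this shows only the dispatching idea.

    from itertools import cycle

    servers = cycle(["web1", "web2", "web3"])   # hypothetical server farm

    def dispatch(request):
        target = next(servers)          # next server in the rotation
        return f"{request} -> {target}"

    for i in range(5):
        print(dispatch(f"req{i}"))
    # req0 -> web1, req1 -> web2, req2 -> web3, req3 -> web1, ...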

Due to the continual demand for ever more powerful servers in ever decreasing spaces, higher density configurations have been developed. In particular, blade servers incorporate a number of sets of server hardware, sometimes as many as nine, each housed inside a high-density module known as a blade, within the space typically occupied by a single computer.

Confusion often arises with regard to the use of the term server in the context of the X Window System, an extensively used and free client-server system for managing GUIs (but not for creating the GUI itself) on single computers and on networks of computers. The X server resides on each local computer (i.e., those used directly by users) instead of on a remote computer; it provides access to the computer's input and output devices (e.g., monitors, keyboards and mice) and performs basic graphics functions. The X clients are the application programs; they can run on the same machines as the X servers, or they can run on some other computer on the network, where a single client can serve many machines containing X servers.

MICROSOFT SQL SERVER

Genesis
SQL Server Release History
Version   Year   Release Name                      Codename
1.0       1989   SQL Server 1.0 (16-bit, OS/2)     -
1.1       1991   SQL Server 1.1 (16-bit, OS/2)     -
4.21      1993   SQL Server 4.21 (WinNT)           SQLNT
6.0       1995   SQL Server 6.0                    SQL95
6.5       1996   SQL Server 6.5                    Hydra
7.0       1998   SQL Server 7.0                    Sphinx
-         1999   SQL Server 7.0 OLAP Tools         Plato
8.0       2000   SQL Server 2000                   Shiloh
8.0       2003   SQL Server 2000 64-bit Edition    Liberty
9.0       2005   SQL Server 2005                   Yukon
10.0      2008   SQL Server 2008                   Katmai
10.25     2010   SQL Azure                         Matrix (aka CloudDB)
10.5      2010   SQL Server 2008 R2                Kilimanjaro (aka KJ)
11.0      -      -                                 Denali
Prior to version 7.0, the code base for MS SQL Server originated in Sybase SQL Server, licensed by Sybase to Microsoft; it was Microsoft's entry into the enterprise-level database market, competing against Oracle, IBM, and, later, Sybase itself. Microsoft, Sybase and Ashton-Tate originally teamed up to create and market the first version, named SQL Server 1.0, for OS/2 (about 1989), which was essentially the same as Sybase SQL Server 3.0 on Unix, VMS, etc. Microsoft SQL Server 4.2 was shipped around 1992 (available bundled with IBM OS/2 version 1.3). Later, Microsoft SQL Server 4.21 for Windows NT was released at the same time as Windows NT 3.1. Microsoft SQL Server v6.0 was the first version designed for NT, and did not include any direction from Sybase.
About the time Windows NT was released, Sybase and Microsoft parted ways and each pursued their own design and marketing schemes. Microsoft negotiated exclusive rights to all versions of SQL Server written for Microsoft operating systems. Later, Sybase changed the name of its product to Adaptive Server Enterprise to avoid confusion with Microsoft SQL Server. Until 1994, Microsoft's SQL Server carried three Sybase copyright notices as an indication of its origin.
Since parting ways, several revisions have been done independently. SQL Server 7.0 was a rewrite from the legacy Sybase code. It was succeeded by SQL Server 2000, which was the first edition to be launched in a variant for the IA-64 architecture.
In the years since the release of Microsoft's previous SQL Server product (SQL Server 2000), advancements have been made in performance, the client IDE tools, and several complementary systems that are packaged with SQL Server 2005. These include: an ETL tool (SQL Server Integration Services or SSIS), a Reporting Server, an OLAP and data mining server (Analysis Services), and several messaging technologies, specifically Service Broker and Notification Services.
SQL Server 2005
SQL Server 2005 (codename Yukon), released in October 2005, is the successor to SQL Server 2000. It included native support for managing XML data, in addition to relational data. For this purpose, it defined an xml data type that could be used either as a data type in database columns or as literals in queries. XML columns can be associated with XSD schemas; XML data being stored is verified against the schema. XML is converted to an internal binary data type before being stored in the database. Specialized indexing methods were made available for XML data, and XML data is queried using XQuery. Common Language Runtime (CLR) integration was a main feature with this edition, enabling one to write SQL Server code as managed code executed by the CLR. SQL Server 2005 added some extensions to the T-SQL language to allow embedding XQuery queries in T-SQL. It also defined a new extension to XQuery, called XML DML, that allows query-based modifications to XML data. SQL Server 2005 also allows a database server to be exposed over web services using TDS packets encapsulated within SOAP requests. When the data is accessed over web services, results are returned as XML.[1]
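As a sketch of the xml data type and embedded XQuery, the following Python script (using the pyodbc library; the DSN, credentials and table are hypothetical) creates a table with an xml column and extracts a value with the type's value() method:

    import pyodbc

    conn = pyodbc.connect("DSN=sql2005;UID=user;PWD=secret")  # assumed DSN
    cur = conn.cursor()

    cur.execute("CREATE TABLE books (id INT PRIMARY KEY, doc XML)")
    cur.execute("INSERT INTO books VALUES (1, '<book><title>DBs</title></book>')")

    # XQuery embedded in T-SQL through the xml type's value() method
    cur.execute("""
        SELECT doc.value('(/book/title)[1]', 'NVARCHAR(100)')
        FROM books
    """)
    print(cur.fetchone()[0])   # DBs
    conn.commit()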
For relational data, T-SQL has been augmented with error handling features (try/catch) and support for recursive queries with CTEs (Common Table Expressions). SQL Server 2005 has also been enhanced with new indexing algorithms, syntax and better error recovery systems. Data pages are checksummed for better error resiliency, and optimistic concurrency support has been added for better performance. Permissions and access control have been made more granular and the query processor handles concurrent execution of queries in a more efficient way. Partitions on tables and indexes are supported natively, so scaling out a database onto a cluster is easier. SQL CLR was introduced with SQL Server 2005 to let it integrate with the .NET Framework.[2]
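A sketch of the two T-SQL additions just described, again through pyodbc with hypothetical connection and table names: a recursive CTE walking an employee hierarchy, and a TRY/CATCH block catching a divide-by-zero error server-side.

    import pyodbc

    conn = pyodbc.connect("DSN=sql2005;UID=user;PWD=secret")
    cur = conn.cursor()

    # Recursive CTE: walk an employee -> manager hierarchy in one query
    cur.execute("""
        WITH org AS (
            SELECT id, manager_id, name FROM employees WHERE manager_id IS NULL
            UNION ALL
            SELECT e.id, e.manager_id, e.name
            FROM employees e JOIN org o ON e.manager_id = o.id
        )
        SELECT * FROM org
    """)
    print(cur.fetchall())

    # TRY/CATCH: the error is caught server-side instead of aborting the batch
    cur.execute("""
        BEGIN TRY
            DECLARE @x INT;
            SET @x = 1 / 0;          -- raises a divide-by-zero error
        END TRY
        BEGIN CATCH
            SELECT ERROR_MESSAGE() AS err;
        END CATCH
    """)
    print(cur.fetchone()[0])         # prints the divide-by-zero message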
SQL Server 2005 introduced "MARS" (Multiple Active Results Sets), a method of allowing usage of database connections for multiple purposes.[3]
SQL Server 2005 introduced DMVs (Dynamic Management Views), which are specialized views and functions that return server state information that can be used to monitor the health of a server instance, diagnose problems, and tune performance.[4]
SQL Server 2005 introduced Database Mirroring, but it was not fully supported until the first Service Pack release (SP1). In the initial release (RTM) of SQL Server 2005, database mirroring was available, but unsupported. In order to implement database mirroring in the RTM version, you had to apply trace flag 1400 at startup.[5] Database mirroring is a high availability option that provides redundancy and failover capabilities at the database level. Failover can be performed manually or can be configured for automatic failover. Automatic failover requires a witness partner and an operating mode of synchronous (also known as high-safety or full safety).[6]
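A sketch of wiring up mirroring in T-SQL (via pyodbc), assuming principal, mirror and witness instances whose endpoints are already configured; all names are hypothetical. SET PARTNER SAFETY FULL selects the synchronous, high-safety mode that automatic failover requires.

    import pyodbc

    conn = pyodbc.connect("DSN=sql2005;UID=user;PWD=secret", autocommit=True)
    cur = conn.cursor()

    # On the principal: point the database at its mirror and its witness
    cur.execute("ALTER DATABASE Sales SET PARTNER = 'TCP://mirror.example.com:5022'")
    cur.execute("ALTER DATABASE Sales SET WITNESS = 'TCP://witness.example.com:5022'")

    # Synchronous (high-safety) mode, required for automatic failover
    cur.execute("ALTER DATABASE Sales SET PARTNER SAFETY FULL")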
SQL Server 2008

SQL Server 2008[7][8] was released (RTM) on August 6, 2008,[9] and aims to make data management self-tuning, self-organizing, and self-maintaining with the development of SQL Server Always On technologies, to provide near-zero downtime. SQL Server 2008 also includes support for structured and semi-structured data, including digital media formats for pictures, audio, video and other multimedia data. In earlier versions, such multimedia data could be stored as BLOBs (binary large objects), but these are generic bitstreams; intrinsic awareness of multimedia data allows specialized functions to be performed on them. According to Paul Flessner, senior Vice President, Server Applications, Microsoft Corp., SQL Server 2008 can be a data storage backend for different varieties of data: XML, email, time/calendar, file, document, spatial, etc., as well as perform search, query, analysis, sharing, and synchronization across all data types.[8]
Other new data types include specialized date and time types and a Spatial data type for location-dependent data.[10] Better support for unstructured and semi-structured data is provided using the new FILESTREAM[11] data type, which can be used to reference any file stored on the file system.[12] Structured data and metadata about the file is stored in SQL Server database, whereas the unstructured component is stored in the file system. Such files can be accessed both via Win32 file handling APIs as well as via SQL Server using T-SQL; doing the latter accesses the file data as a BLOB. Backing up and restoring the database backs up or restores the referenced files as well.[13] SQL Server 2008 also natively supports hierarchical data, and includes T-SQL constructs to directly deal with them, without using recursive queries.[13]
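A sketch of the native hierarchical support, using the hierarchyid type that SQL Server 2008 introduced (connection and names hypothetical): the tree query needs no recursive CTE.

    import pyodbc

    conn = pyodbc.connect("DSN=sql2008;UID=user;PWD=secret")  # assumed DSN
    cur = conn.cursor()

    cur.execute("""
        CREATE TABLE org (node hierarchyid PRIMARY KEY, name NVARCHAR(50));
        INSERT INTO org VALUES
            (hierarchyid::GetRoot(),      'CEO'),
            (hierarchyid::Parse('/1/'),   'CTO'),
            (hierarchyid::Parse('/1/1/'), 'Dev lead');
    """)

    # All descendants of the CTO, read directly from the tree structure
    cur.execute("""
        SELECT name FROM org
        WHERE node.IsDescendantOf(hierarchyid::Parse('/1/')) = 1
    """)
    print(cur.fetchall())
    conn.commit()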
The Full-Text Search functionality has been integrated with the database engine. According to a Microsoft technical article, this simplifies management and improves performance.[14]
Spatial data is stored using two types. A "Flat Earth" (GEOMETRY or planar) data type represents geospatial data which has been projected from its native, spherical coordinate system into a plane. A "Round Earth" data type (GEOGRAPHY) uses an ellipsoidal model in which the Earth is defined as a single continuous entity which does not suffer from singularities such as the international dateline, poles, or map projection zone "edges". Approximately 70 methods are available to represent spatial operations for the Open Geospatial Consortium Simple Features for SQL, Version 1.1.[15]
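A sketch of the GEOGRAPHY type in use (pyodbc; DSN hypothetical): two points are built from WKT with SRID 4326, and STDistance() returns the distance on the ellipsoid in meters. The coordinates follow WKT's longitude-latitude order.

    import pyodbc

    conn = pyodbc.connect("DSN=sql2008;UID=user;PWD=secret")
    cur = conn.cursor()

    cur.execute("""
        DECLARE @seattle geography =
            geography::STGeomFromText('POINT(-122.33 47.61)', 4326);
        DECLARE @redmond geography =
            geography::STGeomFromText('POINT(-122.12 47.67)', 4326);
        SELECT @seattle.STDistance(@redmond);  -- meters on the ellipsoid
    """)
    print(cur.fetchone()[0])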
SQL Server 2008 includes better compression features, which also help in improving scalability.[16] It enhanced the indexing algorithms and introduced the notion of filtered indexes. It also includes the Resource Governor, which allows reserving resources for certain users or workflows. It also includes capabilities for transparent data encryption (TDE) as well as compression of backups.[11] SQL Server 2008 supports the ADO.NET Entity Framework, and the reporting tools, replication, and data definition will be built around the Entity Data Model.[17] SQL Server Reporting Services gains charting capabilities from the integration of the data visualization products from Dundas Data Visualization, Inc., which was acquired by Microsoft.[18] On the management side, SQL Server 2008 includes the Declarative Management Framework, which allows configuring policies and constraints on the entire database or certain tables declaratively.[10] The version of SQL Server Management Studio included with SQL Server 2008 supports IntelliSense for SQL queries against a SQL Server 2008 Database Engine.[19] SQL Server 2008 also makes the databases available via Windows PowerShell providers and management functionality available as Cmdlets, so that the server and all the running instances can be managed from Windows PowerShell.[20]
SQL Server 2008 R2
SQL Server 2008 R2 (formerly codenamed SQL Server "Kilimanjaro") was announced at TechEd 2009, and was released to manufacturing on April 21, 2010.[21] SQL Server 2008 R2 adds certain features to SQL Server 2008, including a master data management system branded as Master Data Services, a central management of master data entities and hierarchies. It also adds Multi Server Management, a centralized console to manage multiple SQL Server 2008 instances and services, including relational databases, Reporting Services, Analysis Services and Integration Services.[22]
SQL Server 2008 R2 includes a number of new services,[23] including PowerPivot for Excel and SharePoint, Master Data Services, StreamInsight, Report Builder 3.0, Reporting Services Add-in for SharePoint, a Data-tier function in Visual Studio that enables packaging of tiered databases as part of an application, and a SQL Server Utility named UC (Utility Control Point), part of AMSM (Application and Multi-Server Management) that is used to manage multiple SQL Servers.[24]
Editions

Microsoft makes SQL Server available in multiple editions, with different feature sets and targeting different users. These editions are:[25][26]
SQL Server Compact Edition (SQL CE)
The compact edition is an embedded database engine. Unlike the other editions of SQL Server, the SQL CE engine is based on SQL Mobile (initially designed for use with hand-held devices) and does not share the same binaries. Due to its small size (1 MB DLL footprint), it has a markedly reduced feature set compared to the other editions. For example, it supports only a subset of the standard data types and does not support stored procedures, views, or multiple-statement batches (among other limitations). It is limited to a 4 GB maximum database size and cannot run as a Windows service; Compact Edition must be hosted by the application using it. The 3.5 version includes considerable work that supports ADO.NET Synchronization Services.
SQL Server Datacenter Edition
SQL Server Developer Edition
SQL Server Developer Edition includes the same features as SQL Server Enterprise Edition, but is limited by the license to be used only as a development and test system, and not as a production server. This edition is available to download by students free of charge as a part of Microsoft's DreamSpark program.
SQL Server 2005 Embedded Edition (SSEE)
SQL Server 2005 Embedded Edition is a specially configured named instance of the SQL Server Express database engine which can be accessed only by certain Windows Services.
SQL Server Enterprise Edition
SQL Server Enterprise Edition is the full-featured edition of SQL Server, including both the core database engine and add-on services, while including a range of tools for creating and managing a SQL Server cluster.
SQL Server Evaluation Edition
SQL Server Evaluation Edition, also known as the Trial Edition, has all the features of the Enterprise Edition, but is limited to 180 days, after which the tools will continue to run, but the server services will stop.[27]
SQL Server Express Edition
SQL Server Express Edition is a scaled down, free edition of SQL Server, which includes the core database engine. While there are no limitations on the number of databases or users supported, it is limited to using one processor, 1 GB of memory and 4 GB database files (10 GB database files from SQL Server Express 2008 R2[28]). The entire database is stored in a single .mdf file, making it suitable for XCOPY deployment. It is intended as a replacement for MSDE. Two additional editions provide supersets of the original Express Edition's features. The first is SQL Server Express with Tools, which includes SQL Server Management Studio Basic. SQL Server Express with Advanced Services adds full-text search capability and reporting services.[29]
SQL Server Fast Track
SQL Server Fast Track is specifically for enterprise-scale data warehousing storage and business intelligence processing, and runs on reference-architecture hardware that is optimized for Fast Track.[30]
SQL Server Standard Edition
SQL Server Standard edition includes the core database engine, along with the stand-alone services. It differs from Enterprise edition in that it supports fewer active instances (number of nodes in a cluster) and does not include some high-availability functions such as hot-add memory (allowing memory to be added while the server is still running), and parallel indexes.
SQL Server Web Edition
SQL Server Web Edition is a low-TCO option for Web hosting.
SQL Server Workgroup Edition
SQL Server Workgroup Edition includes the core database functionality but does not include the additional services.
Architecture

Protocol layer
The protocol layer implements the external interface to SQL Server. All operations that can be invoked on SQL Server are communicated to it via a Microsoft-defined format called Tabular Data Stream (TDS). TDS is an application layer protocol used to transfer data between a database server and a client. It was initially designed and developed by Sybase Inc. for their Sybase SQL Server relational database engine in 1984, and later adopted by Microsoft for Microsoft SQL Server. TDS packets can be encased in other physical transport dependent protocols, including TCP/IP, named pipes, and shared memory. Consequently, access to SQL Server is available over these protocols. In addition, the SQL Server API is also exposed over web services.[26]
Data storage

The main unit of data storage is a database, which is a collection of tables with typed columns. SQL Server supports different data types, including primary types such as Integer, Float, Decimal, Char (including character strings), Varchar (variable length character strings), binary (for unstructured blobs of data), Text (for textual data) among others. The rounding of floats to integers uses either Symmetric Arithmetic Rounding or Symmetric Round Down (Fix) depending on arguments: SELECT Round(2.5, 0) gives 3.
Microsoft SQL Server also allows user-defined composite types (UDTs) to be defined and used. It also makes server statistics available as virtual tables and views (called Dynamic Management Views or DMVs). In addition to tables, a database can also contain other objects including views, stored procedures, indexes and constraints, along with a transaction log. A SQL Server database can contain a maximum of 2^31 objects, and can span multiple OS-level files with a maximum file size of 2^20 TB.[26] The data in the database are stored in primary data files with an extension .mdf. Secondary data files, identified with a .ndf extension, are used to store optional metadata. Log files are identified with the .ldf extension.[26]
Storage space allocated to a database is divided into sequentially numbered pages, each 8 KB in size. A page is the basic unit of I/O for SQL Server operations. A page is marked with a 96-byte header which stores metadata about the page, including the page number, page type, free space on the page and the ID of the object that owns it. The page type defines the data contained in the page: data stored in the database, an index, an allocation map which holds information about how pages are allocated to tables and indexes, a change map which holds information about the changes made to other pages since the last backup or logging, or large data types such as image or text. While a page is the basic unit of an I/O operation, space is actually managed in terms of an extent, which consists of 8 pages. A database object can either span all 8 pages in an extent ("uniform extent") or share an extent with up to 7 more objects ("mixed extent"). A row in a database table cannot span more than one page, so it is limited to 8 KB in size. However, if the data exceeds 8 KB and the row contains Varchar or Varbinary data, the data in those columns is moved to a new page (or possibly a sequence of pages, called an allocation unit) and replaced with a pointer to the data.[31]
For physical storage of a table, its rows are divided into a series of partitions (numbered 1 to n). The partition size is user defined; by default all rows are in a single partition. A table is split into multiple partitions in order to spread a database over a cluster. Rows in each partition are stored in either a B-tree or a heap structure. If the table has an associated index to allow fast retrieval of rows, the rows are stored in order according to their index values, with a B-tree providing the index. The actual data is in the leaf nodes, with the other nodes storing the index values for the leaf data reachable from the respective nodes. If the index is non-clustered, the rows are not sorted according to the index keys. An indexed view has the same storage structure as an indexed table. A table without an index is stored in an unordered heap structure. Both heaps and B-trees can span multiple allocation units.[32]
Buffer management
SQL Server buffers pages in RAM to minimize disc I/O. Any 8 KB page can be buffered in-memory, and the set of all pages currently buffered is called the buffer cache. The amount of memory available to SQL Server decides how many pages will be cached in memory. The buffer cache is managed by the Buffer Manager. Either reading from or writing to any page copies it to the buffer cache. Subsequent reads or writes are redirected to the in-memory copy, rather than the on-disc version. The page is updated on the disc by the Buffer Manager only if the in-memory cache has not been referenced for some time. While writing pages back to disc, asynchronous I/O is used whereby the I/O operation is done in a background thread so that other operations do not have to wait for the I/O operation to complete. Each page is written along with its checksum when it is written. When reading the page back, its checksum is computed again and matched with the stored version to ensure the page has not been damaged or tampered with in the meantime.[33]
Logging and Transaction
SQL Server ensures that any change to the data is ACID-compliant, i.e. it uses transactions to ensure that the database will always revert to a known consistent state on failure. Each transaction may consist of multiple SQL statements, all of which will only make a permanent change to the database if the last statement in the transaction (a COMMIT statement) completes successfully. If the COMMIT completes successfully, the transaction is safely on disk.
SQL Server implements transactions using a write-ahead log.
Any change made to a page updates the in-memory cache of the page; simultaneously, all the operations performed are written to a log, along with the ID of the transaction the operation was a part of. Each log entry is identified by an increasing Log Sequence Number (LSN) which is used to ensure that all changes are written to the data files. During a log restore the LSN is also used to check that no logs are duplicated or skipped. SQL Server requires that the log be written to disc before the data page is written back. It must also ensure that all operations in a transaction are written to the log before any COMMIT operation is reported as completed.
At a later point the server will checkpoint the database and ensure that all pages in the data files have the state of their contents synchronised to a point at or after the LSN at which the checkpoint started. When completed, the checkpoint marks that portion of the log file as complete and may free it (see simple transaction logging vs full transaction logging). This enables SQL Server to ensure integrity of the data, even if the system fails.
On failure the database log has to be replayed to ensure the data files are in a consistent state. All pages stored in the roll-forward part of the log (not marked as completed) are rewritten to the database; when the end of the log is reached, all open transactions are rolled back using the roll-back portion of the log file.
The database engine usually checkpoints quite frequently. However, in a heavily loaded database this can have a significant performance impact. It is possible to reduce the frequency of checkpoints or disable them completely, but the roll-forward during a recovery will then take much longer.
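The write-ahead scheme can be illustrated with a toy model (plain Python, far simpler than the real engine): every change appends a log record with an increasing LSN before the cached page changes, a checkpoint pushes pages to "disk", and recovery replays the log.

    log = []             # the transaction log: (lsn, txn, page, value)
    pages = {"A": 0}     # in-memory buffer cache of data pages
    disk = {"A": 0}      # the on-disk data files
    lsn = 0

    def update(page, value, txn):
        global lsn
        lsn += 1
        log.append((lsn, txn, page, value))   # log record is written FIRST
        pages[page] = value                   # only then the cached page

    def checkpoint():
        disk.update(pages)   # data files catch up with the logged state

    def recover():
        for _, _, page, value in log:   # after a crash: replay the log
            disk[page] = value

    update("A", 42, txn=1)
    checkpoint()
    assert disk["A"] == 42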
Concurrency and locking
SQL Server allows multiple clients to use the same database concurrently. As such, it needs to control concurrent access to shared data, to ensure data integrity when multiple clients update the same data, or when clients attempt to read data that is in the process of being changed by another client. SQL Server provides two modes of concurrency control: pessimistic concurrency and optimistic concurrency. When pessimistic concurrency control is being used, SQL Server controls concurrent access by using locks. Locks can be either shared or exclusive. An exclusive lock grants the user exclusive access to the data; no other user can access the data as long as the lock is held. Shared locks are used when some data is being read: multiple users can read from data locked with a shared lock, but not acquire an exclusive lock. The latter would have to wait for all shared locks to be released. Locks can be applied at different levels of granularity: on entire tables, on pages, or even on a per-row basis. For indexes, a lock can cover either the entire index or individual index leaves. The level of granularity to be used is defined on a per-database basis by the database administrator. While a fine-grained locking system allows more users to use the table or index simultaneously, it requires more resources, so it does not automatically yield a higher-performing solution. SQL Server also includes two more lightweight mutual exclusion solutions, latches and spinlocks, which are less robust than locks but less resource intensive. SQL Server uses them for DMVs and other resources that are usually not busy. SQL Server also monitors all worker threads that acquire locks to ensure that they do not end up in deadlocks; in case they do, SQL Server takes remedial measures, which in many cases is to kill one of the threads entangled in a deadlock and roll back the transaction it started.[26] To implement locking, SQL Server contains the Lock Manager, which maintains an in-memory table that manages the database objects and locks, if any, on them, along with other metadata about the locks. Access to any shared object is mediated by the lock manager, which either grants access to the resource or blocks it.
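A sketch of watching pessimistic locking from the outside, assuming a hypothetical accounts table and DSN: one connection takes row-level locks inside an open transaction, and a second connection reads sys.dm_tran_locks, the DMV that exposes currently held locks.

    import pyodbc

    DSN = "DSN=sql2005;UID=user;PWD=secret"   # assumed connection string

    conn = pyodbc.connect(DSN, autocommit=False)
    cur = conn.cursor()

    # Take update/row-level locks and hold them (transaction stays open)
    cur.execute("SELECT * FROM accounts WITH (UPDLOCK, ROWLOCK) WHERE id = 1")

    mon = pyodbc.connect(DSN).cursor()        # a second, observing connection
    mon.execute("""
        SELECT resource_type, request_mode, request_status
        FROM sys.dm_tran_locks
    """)
    for row in mon.fetchall():
        print(row)        # e.g. ('KEY', 'U', 'GRANT')

    conn.commit()         # releases the locks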
SQL Server also provides the optimistic concurrency control mechanism, which is similar to the multiversion concurrency control used in other databases. The mechanism allows a new version of a row to be created whenever the row is updated, as opposed to overwriting the row, i.e., a row is additionally identified by the ID of the transaction that created the version of the row. Both the old as well as the new versions of the row are stored and maintained, though the old versions are moved out of the database into a system database identified as Tempdb. When a row is in the process of being updated, any other requests are not blocked (unlike locking) but are executed on the older version of the row. If the other request is an update statement, it will result in two different versions of the rows - both of them will be stored by the database, identified by their respective transaction IDs.[26]
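A sketch of switching a database to this optimistic, row-versioning behavior (snapshot isolation); database and table names are hypothetical.

    import pyodbc

    conn = pyodbc.connect("DSN=sql2005;UID=user;PWD=secret", autocommit=True)
    cur = conn.cursor()

    # Enable row versioning for the database (one-time administrative step)
    cur.execute("ALTER DATABASE Sales SET ALLOW_SNAPSHOT_ISOLATION ON")

    # A session then opts in per transaction; reads see the last committed
    # version (kept in tempdb) instead of blocking on writers
    cur.execute("SET TRANSACTION ISOLATION LEVEL SNAPSHOT")
    cur.execute("BEGIN TRANSACTION")
    cur.execute("SELECT balance FROM accounts WHERE id = 1")
    print(cur.fetchone())
    cur.execute("COMMIT")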
Data retrieval

The main mode of retrieving data from an SQL Server database is querying for it. The query is expressed using a variant of SQL called T-SQL, a dialect Microsoft SQL Server shares with Sybase SQL Server due to its legacy. The query declaratively specifies what is to be retrieved. It is processed by the query processor, which figures out the sequence of steps that will be necessary to retrieve the requested data. The sequence of actions necessary to execute a query is called a query plan. There might be multiple ways to process the same query. For example, for a query that contains a join statement and a select statement, executing the join on both tables and then executing the select on the results would give the same result as selecting from each table and then executing the join, but via different execution plans. In such cases, SQL Server chooses the plan that is expected to yield the results in the shortest possible time. This is called query optimization and is performed by the query processor itself.[26]
SQL Server includes a cost-based query optimizer which tries to optimize the cost, in terms of the resources it will take, to execute the query. Given a query, the query optimizer looks at the database schema, the database statistics and the system load at that time. It then decides in which sequence to access the tables referred to in the query, in which sequence to execute the operations, and which access method to use to access the tables. For example, if the table has an associated index, the optimizer decides whether the index should be used: if the index is on a column which is not unique for most of the rows (low "selectivity"), it might not be worthwhile to use the index to access the data. Finally, it decides whether to execute the query concurrently or not. While a concurrent execution is more costly in terms of total processor time, splitting the execution across different processors might mean it will execute faster. Once a query plan is generated for a query, it is temporarily cached. For further invocations of the same query, the cached plan is used. Unused plans are discarded after some time.[26][34]
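A sketch of retrieving the optimizer's chosen plan: with SET SHOWPLAN_XML ON, SQL Server returns the query plan as XML instead of executing the statement (pyodbc; table names hypothetical).

    import pyodbc

    conn = pyodbc.connect("DSN=sql2005;UID=user;PWD=secret", autocommit=True)
    cur = conn.cursor()

    cur.execute("SET SHOWPLAN_XML ON")    # return plans, do not execute
    cur.execute("""
        SELECT o.id FROM orders o
        JOIN customers c ON o.cust_id = c.id
    """)
    plan_xml = cur.fetchone()[0]          # the optimizer's chosen plan
    print(plan_xml)
    cur.execute("SET SHOWPLAN_XML OFF")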
SQL Server also allows stored procedures to be defined. Stored procedures are parameterized T-SQL queries that are stored in the server itself (and not issued by the client application as is the case with general queries). Stored procedures can accept values sent by the client as input parameters, and send back results as output parameters. They can call defined functions and other stored procedures, including the same stored procedure (up to a set number of times). Access to them can be selectively granted. Unlike other queries, stored procedures have an associated name, which is used at runtime to resolve them into the actual queries. Also, because the code need not be sent from the client every time (as it can be accessed by name), network traffic is reduced and performance somewhat improved.[35] Execution plans for stored procedures are also cached as necessary.
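A sketch of a stored procedure with an input and an output parameter, created and then invoked by name (pyodbc; names hypothetical). Since pyodbc has no direct OUTPUT-parameter binding, the call is wrapped in a small T-SQL batch.

    import pyodbc

    conn = pyodbc.connect("DSN=sql2005;UID=user;PWD=secret")
    cur = conn.cursor()

    cur.execute("""
        CREATE PROCEDURE GetOrderCount
            @cust_id INT,          -- input parameter
            @count   INT OUTPUT    -- output parameter
        AS
            SELECT @count = COUNT(*) FROM orders WHERE cust_id = @cust_id;
    """)

    cur.execute("""
        DECLARE @n INT;
        EXEC GetOrderCount @cust_id = 42, @count = @n OUTPUT;
        SELECT @n;
    """)
    print(cur.fetchone()[0])
    conn.commit()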
SQL CLR
Main article: SQL CLR
Microsoft SQL Server 2005 includes a component named SQL CLR ("Common Language Runtime") via which it integrates with the .NET Framework. Unlike most other applications that use the .NET Framework, SQL Server itself hosts the .NET Framework runtime, i.e., the memory, threading and resource management requirements of the .NET Framework are satisfied by SQLOS itself, rather than by the underlying Windows operating system. SQLOS provides deadlock detection and resolution services for .NET code as well. With SQL CLR, stored procedures and triggers can be written in any managed .NET language, including C# and VB.NET. Managed code can also be used to define UDTs (user defined types), which can persist in the database. Managed code is compiled to .NET assemblies and, after being verified for type safety, registered with the database. After that, they can be invoked like any other procedure.[36] However, only a subset of the Base Class Library is available when running code under SQL CLR. Most APIs relating to user interface functionality are not available.[36]
When writing code for SQL CLR, data stored in SQL Server databases can be accessed using the ADO.NET APIs like any other managed application that accesses SQL Server data. However, doing that creates a new database session, different from the one in which the code is executing. To avoid this, SQL Server provides some enhancements to the ADO.NET provider that allow the connection to be redirected to the same session which already hosts the running code. Such connections are called context connections and are established by setting the context connection parameter to true in the connection string. SQL Server also provides several other enhancements to the ADO.NET API, including classes to work with tabular data or a single row of data, as well as classes to work with internal metadata about the data stored in the database. It also provides access to the XML features in SQL Server, including XQuery support. These enhancements are also available in T-SQL procedures in consequence of the introduction of the new xml data type (query, value, nodes functions).[37]
Services

SQL Server also includes an assortment of add-on services. While these are not essential for the operation of the database system, they provide value-added services on top of the core database management system. These services either run as a part of some SQL Server component or out-of-process as a Windows service, and each presents its own API to control and interact with it.
Service Broker
The Service Broker, which runs as a part of the database engine, provides a reliable messaging and message queuing platform for SQL Server applications. Used inside an instance, it provides an asynchronous programming environment. For cross-instance applications, Service Broker communicates over TCP/IP and allows the different components to be synchronized together via the exchange of messages.[38]
Replication Services
SQL Server Replication Services are used by SQL Server to replicate and synchronize database objects, either in entirety or a subset of the objects present, across replication agents, which might be other database servers across the network, or database caches on the client side. Replication follows a publisher/subscriber model, i.e., the changes are sent out by one database server ("publisher") and are received by others ("subscribers"). SQL Server supports three different types of replication:[39]
Transaction replication
Each transaction made to the publisher database (master database) is synced out to subscribers, who update their databases with the transaction. Transactional replication synchronizes databases in near real time.[40]
Merge replication
Changes made at both the publisher and subscriber databases are tracked, and periodically the changes are synchronized bi-directionally between the publisher and the subscribers. If the same data has been modified differently in both the publisher and the subscriber databases, synchronization will result in a conflict which has to be resolved - either manually or by using pre-defined policies. rowguid needs to be configured on a column if merge replication is configured.[41]
Snapshot replication
Snapshot replication publishes a copy of the entire database (the then-current snapshot of the data) and replicates it out to the subscribers. Further changes to the snapshot are not tracked.[42]
Analysis Services
Main article: SQL Server Analysis Services
SQL Server Analysis Services adds OLAP and data mining capabilities for SQL Server databases. The OLAP engine supports MOLAP, ROLAP and HOLAP storage modes for data. Analysis Services supports the XML for Analysis standard as the underlying communication protocol. The cube data can be accessed using MDX queries.[43] Data mining specific functionality is exposed via the DMX query language. Analysis Services includes various algorithms - Decision trees, clustering algorithm, Naive Bayes algorithm, time series analysis, sequence clustering algorithm, linear and logistic regression analysis, and neural networks - for use in data mining.[44]
Reporting Services
Main article: SQL Server Reporting Services
SQL Server Reporting Services is a report generation environment for data gathered from SQL Server databases. It is administered via a web interface. Reporting Services features a web services interface to support the development of custom reporting applications. Reports are created as RDL files.[45]
Reports can be designed using recent versions of Microsoft Visual Studio (Visual Studio.NET 2003, 2005, and 2008)[46] with Business Intelligence Development Studio, installed or with the included Report Builder. Once created, RDL files can be rendered in a variety of formats[47] including Excel, PDF, CSV, XML, TIFF (and other image formats),[48] and HTML Web Archive.
Notification Services
Main article: SQL Server Notification Services
Originally introduced as a post-release add-on for SQL Server 2000,[49] Notification Services was bundled as part of the Microsoft SQL Server platform for the first and only time with SQL Server 2005.[50][51] SQL Server Notification Services is a mechanism for generating data-driven notifications, which are sent to Notification Services subscribers. A subscriber registers for a specific event or transaction (which is registered on the database server as a trigger); when the event occurs, Notification Services can use one of three methods to send a message to the subscriber informing about the occurrence of the event: SMTP, SOAP, or writing to a file in the filesystem.[52] Notification Services was discontinued by Microsoft with the release of SQL Server 2008 in August 2008, and is no longer an officially supported component of the SQL Server database platform.
Integration Services
Main article: SQL Server Integration Services
SQL Server Integration Services is used to integrate data from different data sources. It provides the ETL capabilities for SQL Server for data warehousing needs. Integration Services includes GUI tools to build data extraction workflows integrating various functions, such as extracting data from various sources, querying data, transforming data (including aggregating, de-duplicating and merging data), loading the transformed data into other destinations, and sending e-mails detailing the status of the operation, as defined by the user.[53]
Full Text Search Service
Main article: SQL Server Full Text Search


[Image: The SQL Server Full Text Search service architecture]
SQL Server Full Text Search service is a specialized indexing and querying service for unstructured text stored in SQL Server databases. The full text search index can be created on any column with character-based text data. It allows words to be searched for in the text columns. While such a search can be performed with the SQL LIKE operator, using SQL Server Full Text Search service can be more efficient. Full Text Search (FTS) allows for inexact matching of the source string, indicated by a Rank value which can range from 0 to 1000; a higher rank means a more accurate match. It also allows linguistic matching ("inflectional search"), i.e., linguistic variants of a word (such as a verb in a different tense) will also be a match for a given word (but with a lower rank than an exact match). Proximity searches are also supported, i.e., if the words searched for do not occur in the sequence they are specified in the query but are near each other, they are also considered a match. T-SQL exposes special operators that can be used to access the FTS capabilities.[54][55]
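A sketch of the T-SQL operators that expose FTS, assuming a hypothetical docs table with a full-text-indexed body column: CONTAINS with an inflectional term, and CONTAINSTABLE, which returns the 0-1000 RANK described above.

    import pyodbc

    conn = pyodbc.connect("DSN=sql2008;UID=user;PWD=secret")
    cur = conn.cursor()

    # Inflectional search: matches 'run', 'running', 'ran', ...
    cur.execute("""
        SELECT id FROM docs
        WHERE CONTAINS(body, 'FORMSOF(INFLECTIONAL, run)')
    """)
    print(cur.fetchall())

    # CONTAINSTABLE also returns RANK (0-1000; higher = closer match)
    cur.execute("""
        SELECT d.id, k.RANK
        FROM docs d
        JOIN CONTAINSTABLE(docs, body, 'database NEAR server') k
             ON d.id = k.[KEY]
        ORDER BY k.RANK DESC
    """)
    print(cur.fetchall())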
The Full Text Search engine is divided into two processes - the Filter Daemon process (msftefd.exe) and the Search process (msftesql.exe). These processes interact with the SQL Server. The Search process includes the indexer (that creates the full text indexes) and the full text query processor. The indexer scans through text columns in the database. It can also index through binary columns, and use iFilters to extract meaningful text from the binary blob (for example, when a Microsoft Word document is stored as an unstructured binary file in a database). The iFilters are hosted by the Filter Daemon process. Once the text is extracted, the Filter Daemon process breaks it up into a sequence of words and hands it over to the indexer. The indexer filters out noise words, i.e., words like A, And etc., which occur frequently and are not useful for search. With the remaining words, an inverted index is created, associating each word with the columns they were found in. SQL Server itself includes a Gatherer component that monitors changes to tables and invokes the indexer in case of updates.[56]
When a full text query is received by the SQL Server query processor, it is handed over to the FTS query processor in the Search process. The FTS query processor breaks up the query into the constituent words, filters out the noise words, and uses an inbuilt thesaurus to find out the linguistic variants for each word. The words are then queried against the inverted index and a rank of their accurateness is computed. The results are returned to the client via the SQL Server process.[56]
Tools

SQLCMD
SQLCMD is a command line application that comes with Microsoft SQL Server, and exposes the management features of SQL Server. It allows SQL queries to be written and executed from the command prompt. It can also act as a scripting tool to create and run a set of SQL statements as a script. Such scripts are stored as .sql files, and are used either for management of databases or to create the database schema during the deployment of a database.
SQLCMD was introduced with SQL Server 2005 and continues with SQL Server 2008. Its predecessors for earlier versions were OSQL and ISQL, which are functionally equivalent as far as T-SQL execution is concerned; many of the command line parameters are identical, although SQLCMD adds extra versatility.
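A sketch of driving SQLCMD from a deployment script (Python's subprocess; server and file names hypothetical). The -S, -d, -Q and -i switches select the server, the database, an inline query and an input script, respectively.

    import subprocess

    # Run a one-off query against a (hypothetical) local server
    subprocess.run(["sqlcmd", "-S", "localhost", "-d", "master",
                    "-Q", "SELECT name FROM sys.databases"], check=True)

    # Execute a saved .sql script, e.g. to create a schema during deployment
    subprocess.run(["sqlcmd", "-S", "localhost", "-i", "create_schema.sql"],
                   check=True)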
Visual Studio
Microsoft Visual Studio includes native support for data programming with Microsoft SQL Server. It can be used to write and debug code to be executed by SQL CLR. It also includes a data designer that can be used to graphically create, view or edit database schemas. Queries can be created either visually or using code. From SSMS 2008 onwards, IntelliSense for SQL queries is provided as well.
SQL Server Management Studio
SQL Server Management Studio is a GUI tool included with SQL Server 2005 and later for configuring, managing, and administering all components within Microsoft SQL Server. The tool includes both script editors and graphical tools that work with objects and features of the server.[57] SQL Server Management Studio replaces Enterprise Manager as the primary management interface for Microsoft SQL Server since SQL Server 2005. A version of SQL Server Management Studio is also available for SQL Server Express Edition, for which it is known as SQL Server Management Studio Express (SSMSE).[58]
A central feature of SQL Server Management Studio is the Object Explorer, which allows the user to browse, select, and act upon any of the objects within the server.[59] It can be used to visually observe and analyze query plans and optimize the database performance, among other things.[60] SQL Server Management Studio can also be used to create a new database, alter any existing database schema by adding or modifying tables and indexes, or analyze performance. It includes the query windows, which provide a GUI-based interface to write and execute queries.[26]
Business Intelligence Development Studio
Business Intelligence Development Studio (BIDS) is the IDE from Microsoft used for developing data analysis and Business Intelligence solutions utilizing Microsoft SQL Server Analysis Services, Reporting Services and Integration Services. It is based on the Microsoft Visual Studio development environment but is customized with SQL Server services-specific extensions and project types, including tools, controls and projects for reports (using Reporting Services), cubes and data mining structures (using Analysis Services).[61]
Programmability

T-SQL
Main article: T-SQL
T-SQL (Transact-SQL) is the primary means of programming and managing SQL Server. It exposes keywords for the operations that can be performed on SQL Server, including creating and altering database schemas, entering and editing data, and monitoring and managing the server itself. Client applications that consume data or manage the server leverage SQL Server functionality by sending T-SQL queries and statements, which are then processed by the server, with results (or errors) returned to the client application. SQL Server itself can be managed using T-SQL: it exposes read-only tables from which server statistics can be read, and management functionality is exposed via system-defined stored procedures which can be invoked from T-SQL queries to perform the management operation. It is also possible to create linked servers using T-SQL; a linked server allows a single query to operate against multiple servers.[62]
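As a hedged sketch of this management surface, the snippet below (using the third-party pyodbc package) reads one of the read-only dynamic management views and invokes a system-defined stored procedure; the connection details are placeholders.

    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={SQL Server};SERVER=localhost;"
        "DATABASE=master;Trusted_Connection=yes;"   # placeholder connection
    )
    cursor = conn.cursor()

    # Read-only view exposing server statistics: one row per active session.
    cursor.execute("SELECT session_id, status, login_name FROM sys.dm_exec_sessions")
    for row in cursor.fetchall():
        print(row.session_id, row.status, row.login_name)

    # Management via a system-defined stored procedure.
    cursor.execute("EXEC sp_who")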
SQL Native Client
SQL Native Client is the native client-side data access library for Microsoft SQL Server, version 2005 onwards. It natively implements support for SQL Server features including the Tabular Data Stream implementation, support for mirrored SQL Server databases, full support for all data types supported by SQL Server, asynchronous operations, query notifications, encryption support, as well as receiving multiple result sets in a single database session. SQL Native Client is used under the hood by SQL Server plug-ins for other data access technologies, including ADO and OLE DB. SQL Native Client can also be used directly, bypassing the generic data access layers.[63]
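A minimal sketch of selecting the Native Client driver explicitly through ODBC from Python follows; the driver name varies by installed version, the connection details are placeholders, and the MARS_Connection keyword enables the multiple-result-set support mentioned above.

    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={SQL Server Native Client 10.0};"   # version-specific name
        "SERVER=localhost;DATABASE=AdventureWorks;"
        "Trusted_Connection=yes;"
        "MARS_Connection=yes;"   # multiple active result sets in one session
    )
    row = conn.cursor().execute("SELECT @@VERSION").fetchone()
    print(row[0])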

PROXY SERVER

In computer networks, a proxy server is a server (a computer system or an application) that acts as an intermediary for requests from clients seeking resources from other servers. A client connects to the proxy server, requesting some service, such as a file, connection, web page, or other resource, available from a different server. The proxy server evaluates the request according to its filtering rules. For example, it may filter traffic by IP address or protocol. If the request is validated by the filter, the proxy provides the resource by connecting to the relevant server and requesting the service on behalf of the client. A proxy server may optionally alter the client's request or the server's response, and sometimes it may serve the request without contacting the specified server. In this case, it 'caches' responses from the remote server, and returns subsequent requests for the same content directly.
Most proxies are web proxies, allowing access to content on the World Wide Web.
A proxy server has a large variety of potential purposes, including:
To keep machines behind it anonymous (mainly for security).[1]
To speed up access to resources (using caching). Web proxies are commonly used to cache web pages from a web server.[2]
To apply access policy to network services or content, e.g. to block undesired sites.
To log / audit usage, i.e. to provide company employee Internet usage reporting.
To bypass security/ parental controls.
To scan transmitted content for malware before delivery.
To scan outbound content, e.g., for data leak protection.
To circumvent regional restrictions.
A proxy server that passes requests and replies unmodified is usually called a gateway or sometimes tunneling proxy.
A proxy server can be placed in the user's local computer or at various points between the user and the destination servers on the Internet.
A reverse proxy is (usually) an Internet-facing proxy used as a front-end to control and protect access to a server on a private network, commonly also performing tasks such as load-balancing, authentication, decryption or caching.


Types of proxy

Forward proxies


A forward proxy taking requests from an internal network and forwarding them to the Internet.
Forward proxies are proxies where the client names the target server to connect to.[3] Forward proxies are able to retrieve from a wide range of sources (in most cases anywhere on the Internet).
The terms "forward proxy" and "forwarding proxy" are a general description of behaviour (forwarding traffic) and thus ambiguous. Except for Reverse proxy, the types of proxies described on this article are more specialized sub-types of the general forward proxy concept.
Open proxies


An open proxy forwarding requests from and to anywhere on the Internet.
Main article: Open proxy
An open proxy is a forwarding proxy server that is accessible by any Internet user. Gordon Lyon estimates there are "hundreds of thousands" of open proxies on the Internet.[4] An anonymous open proxy allows users to conceal their IP address while browsing the Web or using other Internet services.
Reverse proxies


A reverse proxy taking requests from the Internet and forwarding them to servers in an internal network. Those making requests connect to the proxy and may not be aware of the internal network.
Main article: Reverse proxy
A reverse proxy is a proxy server that appears to clients to be an ordinary server. Requests are forwarded to one or more origin servers which handle the request. The response is returned as if it came directly from the proxy server.[3]
Reverse proxies are installed in front of one or more web servers. All traffic coming from the Internet with a destination of one of those web servers goes through the proxy server. The use of "reverse" originates in its counterpart "forward proxy", since the reverse proxy sits closer to the web server and serves only a restricted set of websites.
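The forwarding behaviour can be sketched in a few lines of Python; this is a toy single-origin relay with no error handling, and the origin address is a placeholder (production deployments use dedicated software such as the servers listed under Proxy servers below).

    from http.server import BaseHTTPRequestHandler, HTTPServer
    import urllib.request

    ORIGIN = "http://127.0.0.1:8080"    # placeholder origin server

    class ReverseProxy(BaseHTTPRequestHandler):
        def do_GET(self):
            # Fetch the resource from the origin and relay the response, so
            # the client sees the proxy as if it were an ordinary web server.
            with urllib.request.urlopen(ORIGIN + self.path) as upstream:
                body = upstream.read()
                self.send_response(upstream.status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("", 8000), ReverseProxy).serve_forever()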
There are several reasons for installing reverse proxy servers:
Encryption / SSL acceleration: when secure web sites are created, the SSL encryption is often not done by the web server itself, but by a reverse proxy that is equipped with SSL acceleration hardware. See Secure Sockets Layer. Furthermore, a host can provide a single "SSL proxy" to provide SSL encryption for an arbitrary number of hosts, removing the need for a separate SSL server certificate for each host, with the downside that all hosts behind the SSL proxy have to share a common DNS name or IP address for SSL connections. This problem can partly be overcome by using the SubjectAltName feature of X.509 certificates.
Load balancing: the reverse proxy can distribute the load to several web servers, each web server serving its own application area. In such a case, the reverse proxy may need to rewrite the URLs in each web page (translation from externally known URLs to the internal locations).
Serve/cache static content: A reverse proxy can offload the web servers by caching static content like pictures and other static graphical content.
Compression: the proxy server can optimize and compress the content to speed up the load time.
Spoon feeding: reduces resource usage caused by slow clients on the web servers by caching the content the web server sent and slowly "spoon feeding" it to the client. This especially benefits dynamically generated pages.
Security: the proxy server is an additional layer of defense and can protect against some operating system and web server specific attacks. However, it does not provide any protection against attacks on the web application or service itself, which is generally considered the larger threat.
Extranet Publishing: a reverse proxy server facing the Internet can be used to communicate to a firewalled server internal to an organization, providing extranet access to some functions while keeping the servers behind the firewalls. If used in this way, security measures should be considered to protect the rest of your infrastructure in case this server is compromised, as its web application is exposed to attack from the Internet.
Uses of proxy servers

Filtering
Further information: Content-control software
A content-filtering web proxy server provides administrative control over the content that may be relayed through the proxy. It is commonly used in both commercial and non-commercial organizations (especially schools) to ensure that Internet usage conforms to acceptable use policy. In some cases users can circumvent the proxy, since there are services designed to proxy information from a filtered website through a non-filtered site, allowing it through the user's proxy.
A content filtering proxy will often support user authentication, to control web access. It also usually produces logs, either to give detailed information about the URLs accessed by specific users, or to monitor bandwidth usage statistics. It may also communicate with daemon-based and/or ICAP-based antivirus software to provide security against viruses and other malware by scanning incoming content in real time before it enters the network.
Many workplaces, schools, and colleges restrict the web sites and online services that are made available in their buildings. This is done either with a specialized proxy, called a content filter (both commercial and free products are available), or by using a cache-extension protocol such as ICAP, which allows plug-in extensions to an open caching architecture.
Some common methods used for content filtering include: URL or DNS blacklists, URL regex filtering, MIME filtering, or content keyword filtering. Some products have been known to employ content analysis techniques to look for traits commonly used by certain types of content providers.
Requests made to the open internet must first pass through an outbound proxy filter. The web-filtering company provides a database of URL patterns (regular expressions) with associated content attributes. This database is updated weekly by site-wide subscription, much like a virus filter subscription. The administrator instructs the web filter to ban broad classes of content (such as sports, pornography, online shopping, gambling, or social networking). Requests that match a banned URL pattern are rejected immediately.
Assuming the requested URL is acceptable, the content is then fetched by the proxy. At this point a dynamic filter may be applied on the return path. For example, JPEG files could be blocked based on fleshtone matches, or language filters could dynamically detect unwanted language. If the content is rejected then an HTTP fetch error is returned and nothing is cached.
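A minimal sketch of the URL-pattern filtering step described above; the banned patterns are invented examples rather than a real vendor database.

    import re

    BANNED_PATTERNS = [                       # invented example patterns
        re.compile(r"(^|\.)gambling-site\.example$"),
        re.compile(r"(^|\.)sports\.example$"),
    ]

    def is_blocked(host):
        return any(p.search(host) for p in BANNED_PATTERNS)

    print(is_blocked("www.sports.example"))   # True: matches a banned class
    print(is_blocked("news.example"))         # False: the request may proceed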
Most web filtering companies use an internet-wide crawling robot that assesses the likelihood that content is of a certain type (e.g., "this content has a 70% chance of being porn, 40% chance of sports, and 30% chance of news" could be the outcome for one web page). The resultant database is then corrected by manual labor based on complaints or known flaws in the content-matching algorithms.
Web filtering proxies are not able to peer inside secure-socket HTTP (HTTPS) transactions, assuming the chain of trust of SSL/TLS has not been tampered with. As a result, users wanting to bypass web filtering will typically search the internet for an open and anonymous HTTPS transparent proxy. They will then configure their browser to proxy all requests through the web filter to this anonymous proxy. Those requests will be encrypted with HTTPS, and the web filter cannot distinguish these transactions from, say, legitimate access to a financial website. Thus, content filters are only effective against unsophisticated users.
As mentioned above, the SSL/TLS chain of trust does rely on trusted root certificate authorities; in a workplace setting where the client is managed by the organization, trust might be granted to a root certificate whose private key is known to the proxy. Concretely, a root certificate generated by the proxy is installed into the browser CA list by IT staff. In such scenarios, proxy analysis of the contents of an SSL/TLS transaction becomes possible. The proxy is effectively operating a man-in-the-middle attack, allowed by the client's trust of a root certificate the proxy owns.
A special case of web proxies is "CGI proxies". These are web sites that allow a user to access a site through them. They generally use PHP or CGI to implement the proxy functionality. These types of proxies are frequently used to gain access to web sites blocked by corporate or school proxies. Since they also hide the user's own IP address from the web sites they access through the proxy, they are sometimes also used to gain a degree of anonymity, called "Proxy Avoidance".
Caching
A caching proxy server accelerates service requests by retrieving content saved from a previous request made by the same client or even other clients. Caching proxies keep local copies of frequently requested resources, allowing large organizations to significantly reduce their upstream bandwidth usage and costs, while significantly increasing performance. Most ISPs and large businesses have a caching proxy. Caching proxies were the first kind of proxy server.
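The core caching idea can be sketched in a few lines of Python; a real caching proxy would also honor HTTP cache-control headers, which this toy in-memory cache ignores.

    import urllib.request

    cache = {}

    def fetch(url):
        if url in cache:
            return cache[url]       # hit: no upstream bandwidth is used
        with urllib.request.urlopen(url) as response:
            body = response.read()
        cache[url] = body           # store the copy for later requests
        return body

    fetch("http://example.com/")    # first request goes upstream
    fetch("http://example.com/")    # repeat request is served locally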
Some poorly implemented caching proxies have had downsides (e.g., an inability to use user authentication). Some problems are described in RFC 3143 (Known HTTP Proxy/Caching Problems).
Another important use of the proxy server is to reduce hardware cost. An organization may have many systems on the same network or under the control of a single server, making an individual Internet connection for each system impractical. In such a case, the individual systems can be connected to one proxy server, and the proxy server connected to the main server.
Bypassing filters and censorship
If the destination server filters content based on the origin of the request, the use of a proxy can circumvent this filter. For example, a server using IP-based geolocation to restrict its service to a certain country can be accessed using a proxy located in that country.
Likewise, a badly configured proxy can provide access to a network otherwise isolated from the Internet.[4]
Logging and eavesdropping
Proxies can be installed in order to eavesdrop upon the data-flow between client machines and the web. All content sent or accessed – including passwords submitted and cookies used – can be captured and analyzed by the proxy operator. For this reason, passwords to online services (such as webmail and banking) should always be exchanged over a cryptographically secured connection, such as SSL.
By chaining proxies which do not reveal data about the original requester, it is possible to obfuscate activities from the eyes of the user's destination. However, more traces will be left on the intermediate hops, which could be used or offered up to trace the user's activities. If the policies and administrators of these other proxies are unknown, the user may fall victim to a false sense of security just because those details are out of sight and mind.
In what is more of an inconvenience than a risk, proxy users may find themselves being blocked from certain Web sites, as numerous forums and Web sites block IP addresses from proxies known to have spammed or trolled the site. Proxy bouncing can be used to maintain privacy.

Gateways to private networks
Proxy servers can perform a role similar to a network switch in linking two networks.
Accessing services anonymously
Main article: anonymizer
An anonymous proxy server (sometimes called a web proxy) generally attempts to anonymize web surfing. There are different varieties of anonymizers. The destination server (the server that ultimately satisfies the web request) receives requests from the anonymizing proxy server, and thus does not receive information about the end user's address. However, the requests are not anonymous to the anonymizing proxy server, and so a degree of trust is present between the proxy server and the user. Many anonymizers are funded through continued advertising to the user.
Access control: Some proxy servers implement a logon requirement. In large organizations, authorized users must log on to gain access to the web. The organization can thereby track usage to individuals.
Some anonymizing proxy servers may forward data packets with header lines such as HTTP_VIA, HTTP_X_FORWARDED_FOR, or HTTP_FORWARDED, which may reveal the IP address of the client. Other anonymizing proxy servers, known as elite or high anonymity proxies, only include the REMOTE_ADDR header with the IP address of the proxy server, making it appear that the proxy server is the client. A website could still suspect a proxy is being used if the client sends packets which include a cookie from a previous visit that did not use the high anonymity proxy server. Clearing cookies, and possibly the cache, would solve this problem.
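As a sketch, a destination site might test for these header lines as follows; HTTP_VIA and HTTP_X_FORWARDED_FOR are the CGI-style names of the Via and X-Forwarded-For request headers.

    def looks_proxied(headers):
        # Normalize header names, then check for proxy-revealing headers.
        names = {name.lower() for name in headers}
        return "via" in names or "x-forwarded-for" in names

    print(looks_proxied({"Host": "example.com"}))                      # False
    print(looks_proxied({"Host": "example.com", "Via": "1.1 proxy"}))  # True
    print(looks_proxied({"X-Forwarded-For": "203.0.113.7"}))           # True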
Implementations of proxies

Web proxy
A proxy that focuses on World Wide Web traffic is called a "web proxy". The most common use of a web proxy is to serve as a web cache. Most proxy programs provide a means to deny access to URLs specified in a blacklist, thus providing content filtering. This is often used in a corporate, educational, or library environment, and anywhere else where content filtering is desired. Some web proxies reformat web pages for a specific purpose or audience, such as for cell phones and PDAs.
Suffix proxies
A suffix proxy server allows a user to access web content by appending the name of the proxy server to the URL of the requested content (e.g. "en.wikipedia.org.example.com"). Suffix proxy servers are easier to use than regular proxy servers.
Transparent proxies
An intercepting proxy (also forced proxy or transparent proxy) combines a proxy server with a gateway or router (commonly with NAT capabilities). Connections made by client browsers through the gateway are diverted to the proxy without client-side configuration (or often knowledge). Connections may also be diverted from a SOCKS server or other circuit-level proxies.[5]
RFC 2616 (Hypertext Transfer Protocol—HTTP/1.1) offers standard definitions:
"A 'transparent proxy' is a proxy that does not modify the request or response beyond what is required for proxy authentication and identification".
"A 'non-transparent proxy' is a proxy that modifies the request or response in order to provide some added service to the user agent, such as group annotation services, media type transformation, protocol reduction, or anonymity filtering".
A security flaw in the way that transparent proxies operate was published by Robert Auger in 2009,[6] and an advisory by the Computer Emergency Response Team was issued listing dozens of affected transparent and intercepting proxy servers.[7]
Purpose
Intercepting proxies are commonly used in businesses to enforce acceptable use policy and to ease administrative burden, since no client browser configuration is required. The second reason, however, is mitigated by features such as Active Directory group policy, or DHCP and automatic proxy detection.
Intercepting proxies are also commonly used by ISPs in some countries to save upstream bandwidth and improve customer response times by caching. This is more common in countries where bandwidth is more limited (e.g. island nations) or must be paid for.
Issues
The diversion / interception of a TCP connection creates several issues. First, the original destination IP address and port must somehow be communicated to the proxy. This is not always possible (e.g., where the gateway and proxy reside on different hosts). There is a class of cross-site attacks that depend on certain behaviour of intercepting proxies that do not check or have access to information about the original (intercepted) destination. This problem can be resolved by using an integrated packet-level and application-level appliance or software, which is then able to communicate this information between the packet handler and the proxy.
Intercepting also creates problems for HTTP authentication, especially connection-oriented authentication such as NTLM, since the client browser believes it is talking to a server rather than a proxy. This can cause problems where an intercepting proxy requires authentication and the user then connects to a site that also requires authentication.
Finally, intercepting connections can cause problems for HTTP caches, since some requests and responses become uncacheable by a shared cache.
Therefore, intercepting connections is generally discouraged. However, due to the simplicity of deploying such systems, they are in widespread use.
Implementation Methods
Interception can be performed using Cisco's WCCP (Web Cache Communication Protocol). This proprietary protocol resides on the router and is configured from the cache, allowing the cache to determine what ports and traffic are sent to it via transparent redirection from the router. This redirection can occur in one of two ways: GRE tunneling (OSI Layer 3) or MAC rewrites (OSI Layer 2).
Once traffic reaches the proxy machine itself, interception is commonly performed with NAT (Network Address Translation). Such setups are invisible to the client browser, but leave the proxy visible to the web server and other devices on the internet side of the proxy. Recent Linux and some BSD releases provide TPROXY (transparent proxy), which performs IP-level (OSI Layer 3) transparent interception and spoofing of outbound traffic, hiding the proxy IP address from other network devices.
Detection
There are several methods that can often be used to detect the presence of an intercepting proxy server:
By comparing the client's external IP address to the address seen by an external web server, or sometimes by examining the HTTP headers received by a server (a sketch of this method follows this list). A number of sites have been created to address this issue, by reporting the user's IP address as seen by the site back to the user in a web page.[1]
By comparing the sequence of network hops reported by a tool such as traceroute for a proxied protocol such as HTTP (port 80) with that for a non-proxied protocol such as SMTP (port 25).[2][3]
By attempting to make a connection to an IP address at which there is known to be no server. The proxy will accept the connection and then attempt to proxy it on. When the proxy finds no server to accept the connection, it may return an error message or simply close the connection to the client. This difference in behaviour is simple to detect. For example, most web browsers will generate a browser-created error page when they cannot connect to an HTTP server but will return a different error when the connection is accepted and then closed.[8]
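A sketch of the first method: compare the machine's own address with the address an external echo service reports. httpbin.org/ip is one such public service; note that NAT also causes a mismatch, so this is a hint rather than proof.

    import json
    import socket
    import urllib.request

    # Ask an external service what address it sees for our requests.
    with urllib.request.urlopen("https://httpbin.org/ip") as response:
        external_ip = json.load(response)["origin"]

    # Compare with the address this machine believes it has.
    local_ip = socket.gethostbyname(socket.gethostname())
    if external_ip != local_ip:
        print("traffic may pass through a proxy or NAT:",
              local_ip, "->", external_ip)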
Tor onion proxy software
Main article: Tor (anonymity network)


The Vidalia Tor-network map.
The Tor anonymity network ('Tor' for short) is a system aiming at online anonymity.[9] Tor is an implementation of onion routing. It works by relaying communications through a network of systems run by volunteers in various locations. By keeping some of the network entry points hidden, Tor is also able to evade internet censorship.[10] Tor is intended to protect users' personal freedom, privacy, and ability to conduct confidential business.[11]
Users of the Tor network run onion proxy software on their computers. The Tor software periodically negotiates a virtual circuit through the Tor network. At the same time, the onion proxy software presents a SOCKS interface to its clients. SOCKS-aware applications (or applications adapted through an intermediary such as Polipo) may be linked with the Tor onion proxy software, which then multiplexes the traffic through a Tor virtual circuit.
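As a brief illustration, an application can be pointed at the local onion proxy's SOCKS interface; Tor listens on 127.0.0.1:9050 by default, and this sketch assumes the third-party requests package installed with SOCKS support (pip install "requests[socks]").

    import requests

    proxies = {
        "http":  "socks5h://127.0.0.1:9050",   # socks5h resolves DNS via Tor too
        "https": "socks5h://127.0.0.1:9050",
    }
    print(requests.get("https://check.torproject.org/",
                       proxies=proxies).status_code)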
The software is open source and the network is free of charge to use. Vidalia is a cross-platform controller GUI for Tor.
I2P anonymous proxy
Main article: I2P
The I2P anonymous network ('I2P') is a proxy network aiming at online anonymity. It implements garlic routing, which is an enhancement of Tor's onion routing. I2P is fully distributed and works by encrypting all communications in various layers and relaying them through a network of routers run by volunteers in various locations. By keeping the source of the information hidden, I2P offers censorship resistance. The goals of I2P are to protect users' personal freedom, privacy, and ability to conduct confidential business.
Each user of I2P runs an I2P router on their computer (node). The I2P router takes care of finding other peers and building anonymizing tunnels through them. I2P provides proxies for all protocols (HTTP, IRC, SOCKS, ...).
The software is free and open-source, and the network is free of charge to use.
See also

Overview & Discussions
Web accelerator which discusses host-based HTTP acceleration
Transparent SMTP proxy
Reverse proxy which discusses origin-side proxies
Comparison of web servers
Comparison of lightweight web servers
Proxy servers
Apache HTTP Server
Apache Traffic Server - high-performance open-source HTTP proxy server
lighttpd - open-source web server, optimized for speed-critical environments
Microsoft Forefront Threat Management Gateway, (ISA), forward and reverse caching proxy and firewall
Nginx - lightweight, high-performance web server, reverse proxy and e-mail proxy (IMAP/POP3)
Polipo - lightweight pipelining, multiplexing, forwarding and caching proxy, SOCKS proxy and daemon
Pound reverse proxy
Privoxy - privacy enhancing proxy
Squid cache - a proxy server and web cache daemon
Tinyproxy - a fast and small HTTP proxy server daemon, which supports reverse proxying and transparent proxying
Varnish - a performance-focused open source reverse proxy
WinGate - multi-protocol forward/reverse/caching proxy and packet firewall / NAT for Windows platforms.
Ziproxy - lightweight forwarding, non-caching, HTTP proxy for traffic optimization
Diverse Topics
Application layer firewall
Captive portal
Distributed Checksum Clearinghouse
Internet privacy
Proxy list

APPLICATION SERVER

An application server is a software framework that provides an environment where applications can run, no matter what the applications are or what they do.[1] It is dedicated to the efficient execution of procedures (programs, routines, scripts) for supporting the construction of applications.
The term was originally used when discussing early client–server systems to differentiate servers that run SQL services[2] and middleware servers from file servers.
Later, the term took on the meaning of Web applications, but has since evolved further into a more comprehensive service layer. An application server acts as a set of components accessible to the software developer through an API defined by the platform itself. For Web applications, these components usually run on the same machine where the Web server is running, and their main job is to support the construction of dynamic pages. However, present-day application servers target much more than just Web page generation: they implement services like clustering, fail-over and load-balancing, so that developers can focus on implementing the business logic.[3]
Normally the term refers to Java application servers. When this is the case, the application server behaves like an extended virtual machine for the running applications, transparently handling connections to the database on one side and connections to the Web client on the other.
Other uses of the term may refer to the services that a server makes available or to the computer hardware on which the services run.


Java application servers

The Web modules include servlets and JavaServer Pages; business logic resides in Enterprise JavaBeans (EJB), a modular server component providing many features, mostly improving application scalability. The Hibernate project offers an EJB-3 container implementation for the JBoss application server. Tomcat from Apache and JOnAS from ObjectWeb exemplify typical containers that can store these modules. EAServer comes from Sybase Inc. GlassFish, originally from Sun and now from Oracle, is a comprehensive Java enterprise application server available in both community and commercially licensed versions.
A JavaServer Page (JSP) executes in a Web container; like a servlet, it is the Java counterpart of a CGI script. JSPs provide a way to create HTML pages by embedding references to the server logic within the page. HTML coders and Java programmers can work side by side by referencing each other's code from within their own.
The application servers mentioned above mainly serve Web applications. Some application servers target networks other than web-based ones: Session Initiation Protocol servers, for instance, target telephony networks.
.NET Framework

Microsoft
Microsoft positions their middle-tier applications and services infrastructure in the Windows Server operating system and the .NET Framework technologies in the role of an application server.
Third-party
Mono (not fully .NET compatible), developed by Novell, Inc., licensed under GPL.
Base4 Application Server, an open source project
TNAPS Application Server, a freeware application server developed by TN LLC
PHP application servers

PHP application servers are used for running and managing PHP applications.
Zend Server, built by Zend Technologies, provides application server functionality for PHP-based applications
Other platforms

Open-source application servers also come from other vendors. Examples include:
Appaserver
Spring Framework
Non-Java offerings have no formal interoperability specifications on par with the Java Specification Request. As a result, interoperability between non-Java products is poor compared to that of Java EE based products. To address these shortcomings, specifications for enterprise application integration and service-oriented architecture were designed to connect the many different products. These specifications include Business Application Programming Interface, Web Services Interoperability, and Java EE Connector Architecture.
Advantages of application servers

Data and code integrity
By centralizing business logic on an individual server or on a small number of server machines, updates and upgrades to the application for all users can be guaranteed. There is no risk of old versions of the application accessing or manipulating data in an older, incompatible manner.
Centralized configuration
Changes to the application configuration, such as a move of database server, or system settings, can take place centrally.
Security
A central point through which service-providers can manage access to data and portions of the application itself counts as a security benefit, devolving responsibility for authentication away from the potentially insecure client layer without exposing the database layer.
Performance
By limiting the network traffic to performance-tier traffic, the client–server model improves the performance of large applications in heavy usage environments.
Total Cost of Ownership (TCO)
In combination, the benefits above may result in cost savings to an organization developing enterprise applications. In practice, however, the technical challenges of writing software that conforms to that paradigm, combined with the need for software distribution to distribute client code, somewhat negate these benefits.
Transaction Support
A transaction represents a unit of activity in which many updates to resources (on the same or distributed data sources) can be made atomic (as an indivisible unit of work). End-users can benefit from a system-wide standard behaviour, from reduced time to develop, and from reduced costs. As the server does a lot of the tedious code-generation, developers can focus on business logic.
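As a minimal sketch of such an atomic unit of work, independent of any particular application server, the following Python snippet groups two updates into one database transaction using the pyodbc package shown earlier; the connection string, table, and values are placeholders.

    import pyodbc

    conn = pyodbc.connect("DRIVER={SQL Server};SERVER=localhost;"
                          "DATABASE=bank;Trusted_Connection=yes;")
    conn.autocommit = False          # group the updates into one transaction
    cursor = conn.cursor()
    try:
        cursor.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
        cursor.execute("UPDATE accounts SET balance = balance + 100 WHERE id = 2")
        conn.commit()                # both updates become visible atomically
    except pyodbc.Error:
        conn.rollback()              # on failure, neither update takes effect
        raise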