Appendix: R Reference - R in a Nutshell

boot

This package provides functions for bootstrap resampling.

Functions

Function | Description
EEF.profile | Calculates the log-likelihood for a mean using an empirical exponential family likelihood.
EL.profile | Calculates the log-likelihood for a mean using an empirical likelihood.
abc.ci | Calculates equitailed two-sided nonparametric approximate bootstrap confidence intervals for a parameter, given a set of data and an estimator of the parameter, using numerical differentiation.
boot | Generates R bootstrap replicates of a statistic applied to data.
boot.array | Takes a bootstrap object calculated by one of the functions boot, censboot, or tilt.boot and returns the frequency (or index) array for the bootstrap resamples.
boot.ci | Generates five different types of equitailed two-sided nonparametric confidence intervals. These are the first-order normal approximation, the basic bootstrap interval, the Studentized bootstrap interval, the bootstrap percentile interval, and the adjusted bootstrap percentile (BCa) interval. All or a subset of these intervals can be generated.
censboot | Applies types of bootstrap resampling that have been suggested to deal with right-censored data. It can also perform model-based resampling using a Cox regression model.
control | Finds control variate estimates from a bootstrap output object.
corr | Calculates the weighted correlation given a data set and a set of weights.
cum3 | Calculates an estimate of the third cumulant, or skewness, of a vector. Also, if more than one vector is specified, a product-moment of order 3 is estimated.
cv.glm | Calculates the estimated K-fold cross-validation prediction error for generalized linear models.
empinf | Calculates the empirical influence values for a statistic applied to a data set.
envelope | Calculates overall and pointwise confidence envelopes for a curve based on bootstrap replicates of the curve evaluated at a number of fixed points.
exp.tilt | Calculates exponentially tilted multinomial distributions such that the resampling distributions of the linear approximation to a statistic have the required means.
freq.array | Takes a matrix of indices for nonparametric bootstrap resamples and returns the frequencies of the original observations in each resample.
glm.diag | Calculates jackknife deviance residuals, standardized deviance residuals, standardized Pearson residuals, approximate Cook statistic, leverage, and estimated dispersion.
glm.diag.plots | Makes a plot of jackknife deviance residuals against the linear predictor, normal scores plots of standardized deviance residuals, a plot of approximate Cook statistics against leverage/(1 − leverage), and a case plot of the Cook statistic.
imp.moments, imp.prob, imp.quantile | Central moment, tail probability, and quantile estimates for a statistic under importance resampling.
imp.weights | Calculates the importance sampling weight required to correct for simulation from a distribution with probabilities p when estimates are required assuming that simulation was from an alternative distribution with probabilities q.
inv.logit | Given a numeric object, returns the inverse logit of the values.
jack.after.boot | Calculates the jackknife influence values from a bootstrap output object and plots the corresponding jackknife-after-bootstrap plot.
k3.linear | Estimates the skewness of a statistic from its empirical influence values.
lik.CI | Function for use with the practicals in Davison and Hinkley (1997), Bootstrap Methods and Their Application, Cambridge Series in Statistical and Probabilistic Mathematics, No. 1.
linear.approx | Takes a bootstrap object and, for each bootstrap replicate, calculates the linear approximation to the statistic of interest for that bootstrap sample.
logit | Calculates the logit of proportions.
nested.corr | Function for use with the practicals in Davison and Hinkley (1997), Bootstrap Methods and Their Application, Cambridge Series in Statistical and Probabilistic Mathematics, No. 1.
norm.ci | Using the normal approximation to a statistic, calculates equitailed two-sided confidence intervals.
saddle | Calculates a saddlepoint approximation to the distribution of a linear combination of W at a particular point u, where W is a vector of random variables.
saddle.distn | Approximates an entire distribution using saddlepoint methods.
simplex | Optimizes the linear function a %*% x subject to the constraints A1 %*% x <= b1, A2 %*% x >= b2, A3 %*% x = b3, and x >= 0. Either maximization or minimization is possible, but the default is minimization.
smooth.f | Uses the method of frequency smoothing to find a distribution on a data set that has a required value, theta, of the statistic of interest.
tilt.boot | Runs an initial bootstrap with equal resampling probabilities (if required) and uses the output of the initial run to find resampling probabilities that put the value of the statistic at required values. It then runs an importance resampling bootstrap using the calculated probabilities as the resampling distribution.
tsboot | Generates R bootstrap replicates of a statistic applied to a time series. The replicate time series can be generated using fixed or random block lengths or can be model-based replicates.
var.linear | Estimates the variance of a statistic from its empirical influence values.
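
As a minimal sketch of how boot and boot.ci from the table above fit together, the following bootstraps a sample mean and requests two of the interval types. The simulated sample, the seed, and the helper name mean_stat are illustrative assumptions, not part of the package.

```r
library(boot)

set.seed(42)                      # make the resampling reproducible
x <- rexp(50, rate = 1 / 10)      # hypothetical sample: 50 draws with true mean 10

# boot() calls the statistic with the data and a vector of resampled indices
mean_stat <- function(data, indices) mean(data[indices])

# R is the number of bootstrap replicates
b <- boot(data = x, statistic = mean_stat, R = 999)

# Percentile and basic intervals need only the replicates stored in b$t
ci <- boot.ci(b, conf = 0.95, type = c("perc", "basic"))
print(ci)
```

The observed statistic is kept in b$t0 and the replicates in b$t, so other summaries (for example, sd(b$t) as a bootstrap standard error) can be computed from the same object.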

Data Sets

Data Set | Class | Description
acme | data.frame | The acme data frame has 60 rows and 3 columns. The excess returns for the Acme Cleveland Corporation, along with those for all stocks listed on the New York and American Stock Exchanges, were recorded over a 5-year period. These excess returns are relative to the return on a riskless investment such as U.S. Treasury bills.
aids | data.frame | The aids data frame has 570 rows and 6 columns. Although all cases of AIDS in England and Wales must be reported to the Communicable Disease Surveillance Centre, there is often a considerable delay between the time of diagnosis and the time that it is reported. In estimating the prevalence of AIDS, account must be taken of the unknown number of cases that have been diagnosed but not reported. The data set here records the reported cases of AIDS diagnosed from July 1983 until the end of 1992. The data is cross-classified by the date of diagnosis and the time delay in the reporting of the cases.
aircondit | data.frame | Proschan reported on the times between failures of the air-conditioning equipment in 10 Boeing 720 aircraft. The aircondit data frame contains the intervals for the ninth aircraft, while aircondit7 contains those for the seventh aircraft. Both data frames have just one column. Note that the data has been sorted into increasing order.
aircondit7 | data.frame | Proschan reported on the times between failures of the air-conditioning equipment in 10 Boeing 720 aircraft. The aircondit data frame contains the intervals for the ninth aircraft, while aircondit7 contains those for the seventh aircraft. Both data frames have just one column. Note that the data has been sorted into increasing order.
amis | data.frame | The amis data frame has 8,437 rows and 4 columns. In a study into the effect that warning signs have on speeding patterns, Cambridgeshire County Council considered 14 pairs of locations. The locations were paired to account for factors such as traffic volume and type of road. One site in each pair had a sign erected warning of the dangers of speeding and asking drivers to slow down. No action was taken at the second site. Three sets of measurements were taken at each site. Each set of measurements was nominally of the speeds of 100 cars, but not all sites have exactly 100 measurements. These speed measurements were taken before the erection of the sign, shortly after the erection of the sign, and again after the sign had been in place for some time.
aml | data.frame | The aml data frame has 23 rows and 3 columns. A clinical trial to evaluate the efficacy of maintenance chemotherapy for acute myelogenous leukemia was conducted by Embury et al. at Stanford University. After reaching a stage of remission through treatment by chemotherapy, patients were randomized into two groups. The first group received maintenance chemotherapy, and the second group did not. The aim of the study was to see if maintenance chemotherapy increased the length of the remission. The data here formed a preliminary analysis that was conducted in October 1974.
beaver | ts | The beaver data frame has 100 rows and 4 columns. It is a multivariate time series of class "ts" and also inherits from class "data.frame". This data set is part of a long study into body temperature regulation in beavers. Four adult female beavers were live-trapped and had a temperature-sensitive radio transmitter surgically implanted. Readings were taken every 10 minutes. The location of the beaver was also recorded, and her activity level was dichotomized by whether she was in the retreat or outside of it, since high-intensity activities only occur outside of the retreat. The data in this data frame comes from those readings for one of the beavers on a day in autumn.
bigcity | data.frame | The bigcity data frame has 49 rows and 2 columns. The city data frame has 10 rows and 2 columns. The measurements are the populations (in 1000s) of 49 U.S. cities in 1920 and 1930. The 49 cities are a random sample taken from the 196 largest cities in 1920. The city data frame consists of the first 10 observations in bigcity.
brambles | data.frame | The brambles data frame has 823 rows and 3 columns. The location of living bramble canes in a 9-m square plot was recorded. We take 9 m to be the unit of distance so that the plot can be thought of as a unit square. The bramble canes were also classified by their age.
breslow | data.frame | The breslow data frame has 10 rows and 5 columns. In 1961, Doll and Hill sent out a questionnaire to all men on the British Medical Register inquiring about their smoking habits. Almost 70% of the men replied. Death certificates were obtained for medical practitioners, and causes of death were assigned on the basis of these certificates. The breslow data set contains the person-years of observations and deaths from coronary artery disease accumulated during the first 10 years of the study.
calcium | data.frame | The calcium data frame has 27 rows and 2 columns. Howard Grimes of the Botany Department, North Carolina State University, conducted an experiment for biochemical analysis of intracellular storage and transport of calcium across plasma membrane. Cells were suspended in a solution of radioactive calcium for a certain length of time, and then the amount of radioactive calcium that was absorbed by the cells was measured. The experiment was repeated independently with nine different times of suspension, each replicated three times.
cane | data.frame | The cane data frame has 180 rows and 5 columns. The data frame represents a randomized block design with 45 varieties of sugarcane and 4 blocks. The aim of the experiment was to classify the varieties into resistant, intermediate, and susceptible to a disease called "coal of sugarcane" (carvao da cana-de-acucar). This is a disease that is common in sugarcane plantations in certain areas of Brazil. For each plot, 50 pieces of sugarcane stem were put in a solution containing the disease agent, and then some were planted in the plot. After a fixed period of time, the total number of shoots and the number of diseased shoots were recorded.
capability | data.frame | The capability data frame has 75 rows and 1 column. The data consists of simulated successive observations from a process in equilibrium. The process is assumed to have specification limits (5.49, 5.79).
catsM | data.frame | The catsM data frame has 97 rows and 3 columns. One hundred and forty-four adult (over 2 kg in weight) cats used for experiments with the drug digitalis had their heart and body weight recorded. Forty-seven of the cats were female, and 97 were male. The catsM data frame consists of the data for the male cats. The full data can be found in data set cats in package MASS.
cav | data.frame | The cav data frame has 138 rows and 2 columns. The data gives the positions of the individual caveolae in a square region with sides of length 500 units. This grid was originally on a 2.65 μm square of muscle fiber. The data consist of those points falling in the lower-left quarter of the region used for the data set caveolae.dat.
cd4 | data.frame | The cd4 data frame has 20 rows and 2 columns. CD4 cells are carried in the blood as part of the human immune system. One of the effects of the human immunodeficiency virus (HIV) is that these cells die. The count of CD4 cells is used in determining the onset of full-blown AIDS in a patient. In this study of the effectiveness of a new antiviral drug on HIV, 20 HIV-positive patients had their CD4 counts recorded and then were put on a course of treatment with this drug. After using the drug for 1 year, their CD4 counts were again recorded. The aim of the experiment was to show that patients taking the drug had increased CD4 counts, which is not generally seen in HIV-positive patients.
cd4.nested | boot | This is an example of a nested bootstrap for the correlation coefficient of the cd4 data frame.
channing | data.frame | The channing data frame has 462 rows and 5 columns. Channing House is a retirement center in Palo Alto, California. The data was collected from the opening of the house in 1964 until July 1, 1975. During that time, 97 men and 365 women passed through the center. For each of these, their age on entry and also on leaving or death was recorded. A large number of the observations were censored, mainly due to the resident being alive on July 1, 1975, when the data was collected. Over the course of the study, 130 women and 46 men died at Channing House. Differences between the survival of the sexes, taking age into account, were one of the primary concerns of this study.
city | data.frame | The bigcity data frame has 49 rows and 2 columns. The city data frame has 10 rows and 2 columns. The measurements are the populations (in 1000s) of 49 U.S. cities in 1920 and 1930. The 49 cities are a random sample taken from the 196 largest cities in 1920. The city data frame consists of the first 10 observations in bigcity.
claridge | data.frame | The claridge data frame has 37 rows and 2 columns. The data comes from an experiment that was designed to look for a relationship between a certain genetic characteristic and handedness. The 37 subjects were women who had a son with mental retardation due to inheriting a defective X-chromosome. For each such mother, a genetic measurement of her DNA was made. Larger values of this measurement are known to be linked to the defective gene, and it was hypothesized that larger values might also be linked to a progressive shift away from right-handedness. Each woman also filled in a questionnaire regarding which hand she used for various tasks. From these questionnaires, a measure of hand preference was found for each mother. The scale of this measure goes from 1, indicating women who always favor their right hand, to 8, indicating women who always favor their left hand. Between these two extremes are women who favor one hand for some tasks and the other for other tasks.
cloth | data.frame | The cloth data frame has 32 rows and 2 columns.
co.transfer | data.frame | The co.transfer data frame has 7 rows and 2 columns. Seven smokers with chickenpox had their levels of carbon monoxide transfer measured upon being admitted to the hospital and then again after 1 week. The main question was whether 1 week of hospitalization had changed the carbon monoxide transfer factor.
coal | data.frame | The coal data frame has 191 rows and 1 column. This data frame gives the dates of 191 explosions in coal mines that resulted in 10 or more fatalities. The time span of the data is from March 15, 1851, until March 22, 1962.
darwin | data.frame | The darwin data frame has 15 rows and 1 column. Charles Darwin conducted an experiment to examine the superiority of cross-fertilized plants over self-fertilized plants. Fifteen pairs of plants were used. Each pair consisted of one cross-fertilized plant and one self-fertilized plant that germinated at the same time and grew in the same pot. The plants were measured at a fixed time after planting, and the differences in heights between the cross- and self-fertilized plants were recorded in eighths of an inch.
dogs | data.frame | The dogs data frame has 7 rows and 2 columns. Data on the cardiac oxygen consumption and left ventricular pressure was gathered on seven domestic dogs.
downs.bc | data.frame | The downs.bc data frame has 30 rows and 3 columns. Down's syndrome is a genetic disorder caused by an extra chromosome 21 or a part of chromosome 21 being translocated to another chromosome. The incidence of Down's syndrome is highly dependent on the mother's age and rises sharply after age 30. In the 1960s, a large-scale study of the effect of maternal age on the incidence of Down's syndrome was conducted at the British Columbia Health Surveillance Registry. This data frame consists of the data that was collected in that study. Mothers were classified by age. Most groups correspond to the age in years, but the first group comprises all mothers aged 15–17 and the last is those aged 46–49. No data for mothers over 50 or below 15 was collected.
ducks | data.frame | The ducks data frame has 11 rows and 2 columns. Each row of the data frame represents a male duck that is a second-generation cross between a mallard and a pintail. For 11 such ducks, a behavioral index and plumage index were calculated. These were measured on scales devised for this experiment, which was to examine whether there was any link between which species the ducks resembled physically and which they resembled in behavior. The scale for physical appearance ranged from 0 (identical in appearance to a mallard) to 20 (identical to a pintail). The behavioral traits of the ducks were on a scale of 0 to 15, with lower numbers indicating more mallard-like behavior.
fir | data.frame | The fir data frame has 50 rows and 3 columns. The number of balsam-fir seedlings in each quadrant of a grid of 50 five-foot-square quadrants was counted. The grid consisted of 5 rows of 10 quadrants in each row.
frets | data.frame | The frets data frame has 25 rows and 4 columns. The data consists of measurements of the length and breadth of the heads of pairs of adult brothers in 25 randomly sampled families. All measurements are expressed in millimeters.
grav | data.frame | The gravity data frame has 81 rows and 2 columns. The grav data set has 26 rows and 2 columns. Between May 1934 and July 1935, the U.S. National Bureau of Standards conducted a series of experiments to estimate the acceleration due to gravity, g, at Washington, DC. Each experiment produced a number of replicate estimates of g using the same methodology. Although the basic method remained the same for all experiments, that of the reversible pendulum, there were changes in configuration. The gravity data frame contains the data from all eight experiments. The grav data frame contains the data from experiments 7 and 8. The data is expressed as deviations from 980.000 in centimeters per second squared.
gravity | data.frame | The gravity data frame has 81 rows and 2 columns. The grav data set has 26 rows and 2 columns. Between May 1934 and July 1935, the U.S. National Bureau of Standards conducted a series of experiments to estimate the acceleration due to gravity, g, at Washington, DC. Each experiment produced a number of replicate estimates of g using the same methodology. Although the basic method remained the same for all experiments, that of the reversible pendulum, there were changes in configuration. The gravity data frame contains the data from all eight experiments. The grav data frame contains the data from experiments 7 and 8. The data is expressed as deviations from 980.000 in centimeters per second squared.
hirose | data.frame | The hirose data frame has 44 rows and 3 columns. PET film is used in electrical insulation. In this accelerated life test, the failure times for 44 samples in gas-insulated transformers were estimated. Four different voltage levels were used.
islay | data.frame | The islay data frame has 18 rows and 1 column. Measurements were taken of paleocurrent azimuths from the Jura Quartzite on the Scottish island of Islay.
manaus | ts | The manaus time series is of class "ts" and has 1,080 observations on one variable. The data values are monthly averages of the daily stages (heights) of the Rio Negro at Manaus. Manaus is 18 km upstream from the confluence of the Rio Negro with the Amazon, but because of the tiny slope of the water surface and the lower courses of its flatland affluents, the values may be regarded as a good approximation of the water level in the Amazon at the confluence. The data here covers 90 years from January 1903 until December 1992. The Manaus gauge is tied in with an arbitrary benchmark of 100 m set in the steps of the Municipal Prefecture; gauge readings are usually referred to sea level, on the basis of a mark on the steps leading to the Parish Church (Matriz), which is assumed to lie at an altitude of 35.874 m according to observations made many years ago under the direction of Samuel Pereira, an engineer in charge of the Manaus Sanitation Committee. Whereas such an altitude cannot, by any means, be considered to be a precise datum point, observations have been provisionally referred to it. The measurements are in meters.
melanoma | data.frame | The melanoma data frame has 205 rows and 7 columns. The data consists of measurements made on patients with malignant melanoma. Each patient had his or her tumor surgically removed at the Department of Plastic Surgery, University Hospital of Odense, Denmark, during the period 1962–1977. The surgery consisted of complete removal of the tumor together with about 2.5 cm of the surrounding skin. Among the measurements taken were the thickness of the tumor and whether it was ulcerated or not. These are thought to be important prognostic variables in that patients with a thick and/or ulcerated tumor have an increased chance of death from melanoma. Patients were followed until the end of 1977.
motor | data.frame | The motor data frame has 94 rows and 4 columns. The rows were obtained by removing replicate values of time from the data set mcycle. Two extra columns were added to allow for strata with a different residual variance in each stratum.
neuro | matrix | neuro is a matrix containing times of observed firing of a neuron in windows of 250 ms either side of the application of a stimulus to a human subject. Each row of the matrix is a replication of the experiment, and there are a total of 469 replicates.
nitrofen | data.frame | The nitrofen data frame has 50 rows and 5 columns. Nitrofen is a herbicide that was used extensively for the control of broad-leaved and grass weeds in cereals and rice. Although it is relatively nontoxic to adult mammals, nitrofen is a significant teratogen and mutagen. It is also acutely toxic and reproductively toxic to cladoceran zooplankton. Nitrofen is no longer in commercial use in the United States, having been the first pesticide to be withdrawn due to teratogenic effects. The data here comes from an experiment to measure the reproductive toxicity of nitrofen on a species of zooplankton (Ceriodaphnia dubia). Fifty animals were randomized into batches of 10, and each batch was put in a solution with a measured concentration of nitrofen. Then the number of live offspring in each of the three broods of each animal was recorded.
nodal | data.frame | The nodal data frame has 53 rows and 7 columns. The treatment strategy for a patient diagnosed with prostate cancer depends highly on whether the cancer has spread to the surrounding lymph nodes. It is common to operate on the patient to get samples from the nodes, which can then be analyzed under a microscope, but clearly it would be preferable if an accurate assessment of nodal involvement could be made without surgery. For a sample of 53 prostate cancer patients, a number of possible predictor variables were measured before surgery. The patients then had surgery to determine nodal involvement. The point of the study was to see if nodal involvement could be accurately predicted from the predictor variables and which ones were most important.
nuclear | data.frame | The nuclear data frame has 32 rows and 11 columns. The data relates to the construction of 32 light-water reactor (LWR) plants constructed in the United States in the late 1960s and early 1970s. The data was collected with the aim of predicting the cost of construction of additional LWR plants. Six of the power plants had partial turnkey guarantees, and it is possible that, for these plants, some manufacturers' subsidies may be hidden in the quoted capital costs.
paulsen | data.frame | The paulsen data frame has 346 rows and 1 column. Sections were prepared from the brain of adult guinea pigs. Spontaneous currents that flowed into individual brain cells were then recorded and the peak amplitude of each current measured. The aim of the experiment was to see if the current flow was quantal in nature (i.e., that it is not a single burst but instead is built up of many smaller bursts of current). If the current was indeed quantal, then it would be expected that the distribution of the current amplitude would be multimodal with modes at regular intervals. The modes would be expected to decrease in magnitude for higher current amplitudes.
poisons | data.frame | The poisons data frame has 48 rows and 3 columns. The data form a 3 × 4 factorial experiment, the factors being three poisons and four treatments. Each combination of the two factors was used on four animals, the allocation to animals having been completely randomized.
polar | data.frame | The polar data frame has 50 rows and 2 columns. The data consists of the pole positions from a paleomagnetic study of New Caledonian laterites.
remission | data.frame | The remission data frame has 27 rows and 3 columns.
salinity | data.frame | The salinity data frame has 28 rows and 4 columns. Biweekly averages of the water salinity and river discharge in Pamlico Sound, North Carolina, were recorded between the years 1972 and 1977. The data in this set consists only of those measurements in March, April, and May.
survival | data.frame | The survival data frame has 14 rows and 2 columns. The data measured the survival percentages of batches of rats who were given varying doses of radiation. At each of six doses there were two or three replications of the experiment.
tau | data.frame | The tau data frame has 60 rows and 2 columns. The tau particle is a heavy electron-like particle discovered in the 1970s by Martin Perl at the Stanford Linear Accelerator Center. Soon after its production, the tau particle decays into various collections of more stable particles. About 86% of the time, the decay involves just one charged particle. This rate has been measured independently 13 times. The one-charged-particle event is made up of four major modes of decay as well as a collection of other events. The four main types of decay are denoted rho, pi, e, and mu. These rates have been measured independently 6, 7, 14, and 19 times, respectively. Due to physical constraints, each experiment can only estimate the composite one-charged-particle decay rate or the rate of one of the major modes of decay. Each experiment consists of a major research project involving many years' work. One of the goals of the experiments was to estimate the rate of decay due to events other than the four main modes of decay. These are uncertain events and so cannot themselves be observed directly.
tuna | data.frame | The tuna data frame has 64 rows and 1 column. The data comes from an aerial line transect survey of southern bluefin tuna in the Great Australian Bight. An aircraft with two spotters on board flew randomly allocated line transects. Each school of tuna sighted was counted and its perpendicular distance from the transect measured. The survey was conducted in summer when tuna tend to stay on the surface.
urine | data.frame | The urine data frame has 79 rows and 7 columns. Seventy-nine urine specimens were analyzed in an effort to determine if certain physical characteristics of the urine might be related to the formation of calcium oxalate crystals.
wool | ts | wool is a time series of class "ts" and contains 309 observations. Each week that the market was open, the Australian Wool Corporation set a floor price that determined its policy on intervention and was therefore a reflection of the overall price of wool for the week in question. Actual prices paid varied considerably about the floor price. The series here is the log of the ratio between the price for fine-grade wool and the floor price, each market week between July 1976 and June 1984.
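
The data sets above become available once the package is attached. As a small sketch, the following inspects city and bigcity and bootstraps the ratio of 1930 to 1920 population totals for city. It assumes the column names u (1920) and x (1930) from the package help page; the helper name ratio_stat and the seed are illustrative choices.

```r
library(boot)

dim(city)      # rows and columns of the small sample
dim(bigcity)   # rows and columns of the full sample

# Ratio of total 1930 population (x) to total 1920 population (u),
# computed on the rows selected by the resampled indices
ratio_stat <- function(d, indices) {
  d <- d[indices, ]
  sum(d$x) / sum(d$u)
}

set.seed(1)
b <- boot(city, ratio_stat, R = 499)
b$t0          # ratio on the original data
sd(b$t)       # bootstrap estimate of its standard error
```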

class

This package provides functions for classification.

Functions

Function | Description
SOM, batchSOM | Kohonen's self-organizing maps (SOMs) are a crude form of multidimensional scaling.
condense | Condenses training set for k-nearest-neighbor (k-NN) classifier.
knn | k-nearest-neighbor classification for test set from training set. For each row of the test set, the k-nearest (in Euclidean distance) training set vectors are found, and the classification is decided by majority vote, with ties broken at random. If there are ties for the kth nearest vector, all candidates are included in the vote.
knn.cv | k-nearest-neighbor cross-validatory classification from training set.
knn1 | Nearest-neighbor classification for test set from training set. For each row of the test set, the nearest neighbor (by Euclidean distance) training set vector is found, and its classification used. If there is more than one nearest neighbor, a majority vote is used, with ties broken at random.
lvq1, lvq2, lvq3 | Moves examples in a codebook to better represent the training set.
lvqinit | Constructs an initial codebook for learning vector quantization (LVQ) methods.
lvqtest | Classifies a test set by 1-NN from a specified LVQ codebook.
multiedit | Multiedit for k-NN classifier.
olvq1 | Moves examples in a codebook to better represent the training set.
reduce.nn | Reduces training set for a k-NN classifier. Used after condense.
somgrid | Plotting functions for SOM results.
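
To illustrate the majority-vote classification described for knn and knn.cv, here is a minimal sketch. It assumes R's built-in iris data set (not part of the class package); the 100/50 train/test split, k = 3, and the seed are arbitrary choices for the example.

```r
library(class)

set.seed(1)                               # ties are broken at random
idx   <- sample(nrow(iris), 100)          # hypothetical 100/50 train/test split
train <- iris[idx, 1:4]
test  <- iris[-idx, 1:4]
cl    <- iris$Species[idx]                # true classes of the training rows

# Classify each test row by majority vote among its 3 nearest training rows
pred <- knn(train, test, cl, k = 3)
acc  <- mean(pred == iris$Species[-idx])  # share of test rows classified correctly

# Leave-one-out cross-validated predictions on the full data set
loo <- knn.cv(iris[, 1:4], iris$Species, k = 3)
table(predicted = loo, actual = iris$Species)
```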

cluster

This package provides functions for cluster analysis.

Functions

FunctionDescription
agnesComputes agglomerative hierarchical clustering of thedata set.
bannerplot Draws a "banner," i.e., a horizontal barplot visualizing the (agglomerative or divisive) hierarchical clustering or another binary dendrogram structure.
clara Computes a "clara" object, a list representing a clustering of the data into k clusters.
clusplot Draws a two-dimensional (2D) "clusplot" on the current graphics device.
coef.hclust Computes the "agglomerative coefficient," measuring the clustering structure of the data set.
daisy Computes all the pairwise dissimilarities (distances) between observations in the data set.
diana Computes a divisive hierarchical clustering of the data set, returning an object of class diana.
ellipsoidPoints Computes points on the ellipsoid boundary, mostly for drawing.
ellipsoidhull Computes the "ellipsoid hull" or "spanning ellipsoid," i.e., the ellipsoid of minimal volume ("area" in 2D) such that all given points lie just inside or on the boundary of the ellipsoid.
fanny Computes a fuzzy clustering of the data into k clusters.
lower.to.upper.tri.inds Computes index vectors for extracting or reordering of lower or upper triangular matrices that are stored as contiguous vectors.
mona Returns a list representing a divisive hierarchical clustering of a data set with binary variables only.
pam Partitions (clusters) the data into k clusters "around medoids," a more robust version of k-means clustering.
pltree Generic function drawing a clustering tree ("dendrogram") on the current graphics device. There is a twins method; see pltree.twins for usage and examples.
predict.ellipsoid Computes points on the ellipsoid boundary, mostly for drawing.
silhouette Computes silhouette information according to a given clustering in k clusters.
sizeDiss Returns the number of observations (sample size) corresponding to a dissimilarity-like object or, equivalently, the number of rows or columns of a matrix when only the lower or upper triangular part (without diagonal) is given. It is simply the inverse function of f(n) = n(n − 1)/2.
sortSilhouette Computes silhouette information according to a given clustering in k clusters.
upper.to.lower.tri.inds Computes index vectors for extracting or reordering of lower or upper triangular matrices that are stored as contiguous vectors.
volume Computes the volume of a planar object. This is a generic function and a method for ellipsoid objects.
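
As a rough illustration of how pam and silhouette from the table above fit together, the following sketch clusters the ruspini data set (described under Data Sets) into four groups and summarizes the silhouette widths. The choice of k = 4 and the variable names are assumptions made for this example only.

```r
library(cluster)

# ruspini ships with the cluster package: 75 points in 2 dimensions
data(ruspini)

# Partition around medoids; k = 4 is an assumption for this sketch
fit <- pam(ruspini, k = 4)

# Silhouette information for the resulting clustering
sil <- silhouette(fit)
avg_width <- summary(sil)$avg.width

# Cluster sizes and the average silhouette width
print(table(fit$clustering))
print(avg_width)
```

An average silhouette width close to 1 suggests well-separated clusters; values near 0 suggest overlapping ones.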

Data Sets

Data Set Class Description
agriculture data.frame Gross national product (GNP) per capita and percentage of the population working in agriculture for each country belonging to the European Union in 1993.
animals data.frame This data set considers 6 binary attributes for 20 animals.
chorSub matrix This is a small rounded subset of the C-horizon data.
flower data.frame This data set consists of 8 characteristics for 18 popular flowers.
plantTraits data.frame This data set constitutes a description of 136 plant species according to biological attributes (morphological or reproductive).
pluton data.frame The pluton data frame has 45 rows and 4 columns, containing percentages of isotopic composition of 45 plutonium batches.
ruspini data.frame The Ruspini data set, consisting of 75 points in 4 groups, is popular for illustrating clustering techniques.
votes.repub data.frame A data frame with the percents of votes given to the Republican candidates in presidential elections from 1856 to 1976. Rows represent the 50 states, and columns the 31 elections.
xclara data.frame An artificial data set consisting of 3,000 points in 3 well-separated clusters of size 1,000 each.
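
The relationship between daisy and sizeDiss can be checked on one of these data sets. The sketch below, with variable names chosen only for illustration, computes dissimilarities for the flower data and recovers the number of observations from the length of the result. Because flower contains non-numeric columns, daisy falls back to Gower's coefficient.

```r
library(cluster)

data(flower)

# Pairwise dissimilarities; mixed column types trigger Gower's coefficient
d <- daisy(flower)

# A dissimilarity object stores n(n - 1)/2 values; sizeDiss inverts that
n <- sizeDiss(d)

print(length(d))  # number of stored pairwise dissimilarities
print(n)          # number of observations recovered from that length
```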

codetools

This package provides tools for analyzing R code. It is mainly intended to support the other tools in this package and byte code compilation. See the help file for more information.

foreign

This package provides functions for reading data stored by Minitab, S, SAS, SPSS, Stata, Systat, dBase, and so forth.

Functions

Function Description
data.restore Reads binary data files or data.dump files that were produced in S version 3.
lookup.xport Scans a file as a SAS XPORT format library and returns a list containing information about the SAS library.
read.S Reads binary data files or data.dump files that were produced in S version 3.
read.arff Reads data from Weka Attribute-Relation File Format (ARFF) files.
read.dbf Reads a DBF file into a data frame, converting character fields to factors and trying to respect NULL fields.
read.dta Reads a file in Stata version 5–10 binary format into a data frame.
read.epiinfo Reads data files in the .REC format used by Epi Info versions 6 and earlier and by EpiData. Epi Info is a public-domain database and statistics package produced by the U.S. Centers for Disease Control and Prevention, and EpiData is a freely available data entry and validation system.
read.mtp Returns a list with the data stored in a file as a Minitab Portable Worksheet.
read.octave Reads a file in Octave text data format into a list.
read.spss Reads a file stored by the SPSS save or export commands.
read.ssd Generates a SAS program to convert the ssd contents to SAS transport format and then uses read.xport to obtain a data frame.
read.systat Reads a rectangular data file stored by the Systat SAVE command as (legacy) *.sys or, more recently, *.syd files.
read.xport Reads a file as a SAS XPORT format library and returns a list of data frames.
write.arff Writes data into Weka Attribute-Relation File Format (ARFF) files.
write.dbf Tries to write a data frame to a DBF file.
write.dta Writes the data frame to file in the Stata binary format. Does not write array variables unless they can be dropped to a vector.
write.foreign Exports simple data frames to other statistical packages by writing the data as free-format text and writing a separate file of instructions for the other package to read the data.
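
A short round trip can show how the read and write functions pair up. This sketch writes a small, made-up data frame to a temporary DBF file with write.dbf and reads it back with read.dbf. The column names and values are invented for the example, and as.is = TRUE is passed so that character fields are not converted to factors.

```r
library(foreign)

# A small, made-up data frame for illustration
df <- data.frame(id = 1:3,
                 name = c("a", "b", "c"),
                 score = c(1.5, 2.25, 3),
                 stringsAsFactors = FALSE)

# Write to a temporary DBF file, then read it back
path <- tempfile(fileext = ".dbf")
write.dbf(df, path)
back <- read.dbf(path, as.is = TRUE)  # as.is = TRUE keeps character columns

print(back)
unlink(path)
```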

grDevices

This package provides functions for graphics devices and support for base and grid graphics.

Functions

Function Description
CIDFont Used to define the translation of an R graphics font family name to a Type 1 or CID font description, used by both the postscript and the pdf graphics devices.
Type1Font Used to define the translation of an R graphics font family name to a Type 1 or CID font description, used by both the postscript and the pdf graphics devices.
X11 Starts a graphics device driver for the X Window System (version 11). This can only be done on machines/accounts that have access to an X server.
X11.options Sets options for an X11 device.
X11Font, X11Fonts Handle the translation of a device-independent R graphics font family name to an X11 font description.
as.graphicsAnnot Coerces an R object into a form suitable for graphics annotation.
bitmap Generates a graphics file. dev2bitmap copies the current graphics device to a file in a graphics format.
bmp Graphics device for generating BMP (bitmap) files.
boxplot.stats This function is typically called by another function to gather the statistics necessary for producing box plots, but may be invoked separately.
cairo_pdf A Cairo-based graphics device for generating PDF files.
cairo_ps A Cairo-based graphics device for generating PostScript files.
check.options Utility function for setting options with some consistency checks. The attributes of the new settings in new are checked for consistency with the model (often default) list in name.opt.
chull Computes the subset of points that lie on the convex hull of the set of points specified.
cm Translates from inches to centimeters (cm).
cm.colors Creates a vector of n contiguous colors.
col2rgb R color to RGB (red/green/blue) conversion.
colorConverter Specifies color spaces for use in convertColor.
colorRamp, colorRampPalette These functions return functions that interpolate a set of given colors to create new color palettes (like topo.colors) and color ramps, functions that map the interval [0, 1] to colors (like gray).
colors, colours Returns the built-in color names that R knows about.
contourLines Calculates contour lines for a given set of data.
convertColor Converts colors between standard color space representations. This function is experimental.
densCols Produces a vector containing colors that encode the local densities at each point in a scatter plot.
dev.control Allows the user to control the recording of graphics operations in a device.
dev.copy Copies the graphics contents of the current device to the device specified by which or to a new device that has been created by the function specified by device (it is an error to specify both which and device).
dev.copy2eps Copies the graphics contents of the current device to an Encapsulated PostScript Format (EPSF) output file in portrait orientation (horizontal = FALSE).
dev.copy2pdf Copies the graphics contents of the current device to a PDF output file in portrait orientation (horizontal = FALSE).
dev.cur Returns a named integer vector of length 1, giving the number and name of the active device, or 1, the null device, if none is active.
dev.interactive Tests if the current graphics device (or that which would be opened) is interactive.
dev.list Returns the numbers of all open devices, except device 1, the null device. This is a numeric vector with a names attribute giving the device names, or NULL if there is no open device.
dev.new Opens a new graphics device.
dev.next Returns the number and name of the next device in the list of devices.
dev.off Shuts down the specified (by default the current) graphics device.
dev.prev Returns the number and name of the previous device in the list of devices.
dev.print Copies the graphics contents of the current device to a new device that has been created by the function specified by device and then shuts down the new device.
dev.set Makes the specified graphics device the active device.
dev.size Finds the dimensions of the device surface of the current device.
dev2bitmap bitmap generates a graphics file. dev2bitmap copies the current graphics device to a file in a graphics format.
devAskNewPage Used to control (for the current device) whether the user is prompted before starting a new page of output.
deviceIsInteractive Tests if the current graphics device (or that which would be opened) is interactive.
embedFonts Runs Ghostscript to process a PDF or PostScript file and embed all fonts in the file.
extendrange Extends a numeric range by a small percentage, i.e., fraction, on both sides.
getGraphicsEvent Waits for input from a graphics window in the form of a mouse or keyboard event.
graphics.off Shuts down all open graphics devices.
gray Creates a vector of colors from a vector of gray levels.
gray.colors Creates a vector of n gamma-corrected gray colors.
grey Creates a vector of colors from a vector of gray levels.
grey.colors Creates a vector of n gamma-corrected gray colors.
hcl Creates a vector of colors from vectors specifying hue, chroma, and luminance.
heat.colors Creates a vector of n contiguous colors.
hsv Creates a vector of colors from vectors specifying hue, saturation, and value.
jpeg Creates a graphics device for generating JPEG format files.
make.rgb Specifies color spaces for use in convertColor.
n2mfrow Easy setup for plotting multiple figures (in a rectangular layout) on one page. This computes a sensible default for par(mfrow).
nclass.FD Computes the number of classes for a histogram using the Freedman-Diaconis choice based on the interquartile range (IQR), unless that is 0, where it reverts to mad(x, constant = 2), and when that is 0 as well, returns 1.
nclass.Sturges Computes the number of classes for a histogram using Sturges's formula, implicitly basing bin sizes on the range of the data.
nclass.scott Computes the number of classes for a histogram using Scott's choice for a normal distribution based on the estimate of the standard error, unless that is 0, where it returns 1.
palette Views or manipulates the color palette that is used when a col= has a numeric index.
pdf Starts the graphics device driver for producing PDF graphics.
pdf.options The auxiliary function pdf.options can be used to set or view (if called without arguments) the default values for some of the arguments to pdf.
pdfFonts Lists existing mapping for PDF fonts or creates new mappings.
pictex Produces graphics suitable for inclusion in TeX and LaTeX documents.
png Creates a new graphics device for producing Portable Network Graphics (PNG) files.
postscript Starts the graphics device driver for producing PostScript graphics.
postscriptFonts Lists existing mapping for PostScript fonts or creates new mappings.
ps.options The auxiliary function ps.options can be used to set or view (if called without arguments) the default values for some of the arguments to postscript.
quartz Starts a graphics device driver for the Mac OS X system.
quartz.options Sets options for a quartz device.
quartzFont Translates from a device-independent R graphics font family name to a quartz font description.
quartzFonts Lists existing mappings of device-independent R graphics to a quartz font description, or defines new mappings.
rainbow Creates a vector of n contiguous colors.
recordGraphics Records arbitrary code on the graphics engine display list. Useful for encapsulating calculations with graphical output that depends on the calculations. Intended only for expert use.
recordPlot, replayPlot Functions to save the current plot in an R variable and to replay it.
rgb Creates colors corresponding to the given intensities (between 0 and max) of the red, green, and blue primaries.
rgb2hsv Transforms colors from RGB space (red/green/blue) into HSV space (hue/saturation/value).
savePlot Saves the current page of a Cairo X11() device to a file.
setEPS A wrapper to ps.options that sets defaults appropriate for figures for inclusion in documents (the default size is 7 inches square unless width or height is supplied).
setPS A wrapper to ps.options to set defaults appropriate for figures for spooling to a PostScript printer.
svg Creates a new graphics device for outputting graphics in Scalable Vector Graphics (SVG) format.
terrain.colors Creates a vector of n contiguous colors.
tiff Creates a new graphics device for outputting graphics in Tagged Image File Format (TIFF) format.
topo.colors Creates a vector of n contiguous colors.
trans3d Projection of three-dimensional to two-dimensional points using a 4 × 4 viewing transformation matrix.
x11 A synonym for X11 (which opens a new X11 device for plotting graphics).
xfig Starts the graphics device driver for producing XFig (version 3.2) graphics.
xy.coords Used by many functions to obtain x and y coordinates for plotting. The use of this common mechanism across all relevant R functions produces a measure of consistency.
xyTable Given (x, y) points, determines their multiplicity, checking for equality only up to some (crude kind of) noise. Note that this is a special kind of 2D binning.
xyz.coords Utility for obtaining consistent x, y, and z coordinates and labels for three-dimensional (3D) plots.
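
To show how a few of these functions are typically combined, the sketch below builds a palette with colorRampPalette, inspects a color with col2rgb, and writes a plot to a PDF device that is then closed with dev.off. The file location (a temporary file), the device size, and the variable names are assumptions for this example.

```r
# Build a palette function that interpolates between two colors
pal <- colorRampPalette(c("blue", "red"))
cols <- pal(5)

# Convert a named color to its red/green/blue components
rgb_red <- col2rgb("red")

# Open a PDF device, draw, and shut the device down again
out <- tempfile(fileext = ".pdf")
pdf(out, width = 5, height = 4)
plot(1:5, col = cols, pch = 19)
dev.off()

print(cols)
print(rgb_red)
```

Closing the device with dev.off is what flushes the plot to the file, so a missing dev.off is a common reason for empty or unreadable output.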

Data Sets

Data Set Class Description
Hershey list If the family graphical parameter (see par) has been set to one of the Hershey fonts, Hershey vector fonts are used to render text. When using the text and contour functions, Hershey fonts may be selected via the vfont argument, which is a character vector of length 2. This allows Cyrillic to be selected, which is not available via the font families.
blues9 character densCols produces a vector containing colors that encode the local densities at each point in a scatter plot.
colorspaces list Converts colors between standard color space representations. This function is experimental.

graphics

This package contains functions for base graphics. Base graphics are traditional S graphics, as opposed to the newer grid graphics.

Functions

Function Description
Axis Generic function to add a suitable axis to the current plot.
abline Adds one or more straight lines through the current plot.
arrows Draws arrows between pairs of points.
assocplot Produces a Cohen-Friendly association plot indicating deviations from independence of rows and columns in a two-dimensional contingency table.
axTicks Computes pretty tick mark locations, the same way as R does internally. This is only nontrivial when log coordinates are active. By default, gives the at values that axis(side) would use.
axis Adds an axis to the current plot, allowing the specification of the side, position, labels, and other options.
barplot Creates a bar plot with vertical or horizontal bars.
box Draws a box around the current plot in a given color and line type. The bty parameter determines the type of box drawn. See par for details.
boxplot Produces box-and-whisker plot(s) of the given (grouped) values.
boxplot.matrix Interprets the columns (or rows) of a matrix as different groups and draws a box plot for each.
bxp Draws box plots based on the given summaries in z. It is usually called from within boxplot, but can be invoked directly.
cdplot Computes and plots conditional densities describing how the conditional distribution of a categorical variable y changes over a numeric variable x.
clip Sets clipping region in user coordinates.
close.screen Removes the specified screen definition(s) created by split.screen.
co.intervals Produces two variants of the conditioning plots.
contour Creates a contour plot or adds contour lines to an existing plot. Methods include contour.default.
coplot Produces two variants of the conditioning plots.
curve Draws a curve corresponding to a given function or, for curve(), also an expression (in x) over the interval [from, to].
dotchart Draws a Cleveland dot plot.
erase.screen Used to clear a single screen (when using split.screen), which it does by filling with the background color.
filled.contour Produces a contour plot with the areas between the contours filled in solid color (Cleveland calls this a level plot).
fourfoldplot Creates a fourfold display of a 2-by-2-by-k contingency table on the current graphics device, allowing for the visual inspection of the association between two dichotomous variables in one or several populations (strata).
frame This function (frame is an alias for plot.new) causes the completion of plotting in the current plot (if there is one) and an advance to a new graphics frame.
grconvertX, grconvertY Convert between graphics coordinate systems.
grid Adds an nx-by-ny rectangular grid to an existing plot.
hist The generic function hist computes a histogram of the given data values. If plot=TRUE, the resulting object of class "histogram" is plotted by plot.histogram before it is returned. Methods include hist.default.
identify Reads the position of the graphics pointer when the (first) mouse button is pressed. It then searches the coordinates given in x and y for the point closest to the pointer. If this point is close enough to the pointer, its index will be returned as part of the value of the call.
image Creates a grid of colored or grayscale rectangles with colors corresponding to the values in z.
layout, layout.show layout divides the device up into as many rows and columns as there are in matrix mat, with the column widths and the row heights specified in the respective arguments.
lcm layout divides the device up into as many rows and columns as there are in matrix mat, with the column widths and the row heights specified in the respective arguments.
legend Used to add legends to plots. Note that a call to the function locator(1) can be used in place of the x and y arguments.
lines A generic function taking coordinates given in various ways and joining the corresponding points with line segments. Methods include lines.default and lines.ts.
locator Reads the position of the graphics cursor when the (first) mouse button is pressed.
matlines, matplot, matpoints Plot the columns of one matrix against the columns of another.
mosaicplot Plots a mosaic on the current graphics device.
mtext Text is written in one of the four margins of the current figure region or one of the outer margins of the device region.
pairs A matrix of scatter plots is produced.
panel.smooth An example of a simple useful panel function to be used as an argument in, e.g., coplot or pairs.
par Used to set or query graphical parameters.
persp Draws perspective plots of surfaces over the xy plane.
pie Draws a pie chart.
plot Generic function for plotting R objects.
plot.design Plots univariate effects of one or more factors, typically for a designed experiment as analyzed by aov().
plot.new This function (plot.new is an alias for frame) causes the completion of plotting in the current plot (if there is one) and an advance to a new graphics frame.
plot.window Sets up the world coordinate system for a graphics window. It is called by higher-level functions such as plot.default (after plot.new).
plot.xy This is the internal function that does the basic plotting of points and lines. Usually, one should use the higher-level functions instead and refer to their help pages for explanation of the arguments.
points A generic function to draw a sequence of points at the specified coordinates. The specified character(s) are plotted, centered at the coordinates. Methods include points.default.
polygon Draws the polygons whose vertices are given in x and y.
rect Draws a rectangle (or sequence of rectangles) with the given coordinates, fill, and border colors.
rug Adds a rug representation (1D plot) of the data to the plot.
screen Used to select which screen to draw in (when using split.screen).
segments Draws line segments between pairs of points.
smoothScatter Produces a smoothed color density representation of the scatter plot, obtained through a kernel density estimate.
spineplot Spine plots are a special case of mosaic plots and can be seen as a generalization of stacked (or highlighted) bar plots. Analogously, spinograms are an extension of histograms.
split.screen Defines a number of regions within the current device that can, to some extent, be treated as separate graphics devices. It is useful for generating multiple plots on a single device.
stars Draws star plots or segment diagrams of a multivariate data set. With one single location, also draws "spider" (or "radar") plots.
stem Produces a stem-and-leaf plot of the values in x.
strheight Computes the height of the given strings or mathematical expressions s[i] on the current plotting device in user coordinates, inches, or as a fraction of the figure width par("fin").
stripchart Produces one-dimensional scatter plots (or dot plots) of the given data. These plots are a good alternative to box plots when sample sizes are small.
strwidth Computes the width of the given strings or mathematical expressions s[i] on the current plotting device in user coordinates, inches, or as a fraction of the figure width par("fin").
sunflowerplot Multiple points are plotted as "sunflowers" with multiple leaves ("petals") such that overplotting is visualized instead of accidental and invisible.
symbols Draws symbols on a plot. One of six symbols (circles, squares, rectangles, stars, thermometers, and boxplots) can be plotted at a specified set of x and y coordinates.
text Draws the strings given in the vector labels at the coordinates given by x and y. y may be missing since xy.coords(x, y) is used for construction of the coordinates.
title Used to add labels to a plot.
xinch, xyinch, yinch xinch and yinch convert the specified number of inches given as their arguments into the correct units for plotting with graphics functions. Usually, this only makes sense when normal coordinates are used, i.e., no log scale (see the log argument to par). xyinch does the same for a pair of numbers xy, simultaneously.
xspline Draws an X-spline, a curve drawn relative to control points.
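
The following sketch strings together several of the base graphics functions listed above: par to arrange two panels, hist and plot to draw them, and abline and legend to annotate. The simulated data, titles, and the use of pdf(NULL) (a PDF device that writes no file) are assumptions made to keep the example self-contained.

```r
set.seed(42)
x <- rnorm(100)  # simulated data for illustration only

pdf(NULL)                     # PDF device that creates no file
op <- par(mfrow = c(1, 2))    # two panels side by side; keep old settings

# Left panel: histogram with a dashed line at the sample mean
h <- hist(x, main = "Histogram", xlab = "x")
abline(v = mean(x), lty = 2)

# Right panel: the same values drawn as a line, with a legend
plot(x, type = "l", main = "Series")
legend("topright", legend = "x", lty = 1)

par(op)                       # restore the previous graphical parameters
dev.off()
```

Saving the value returned by par and restoring it afterward avoids leaking layout settings into later plots on the same device.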

grid

This package is a low-level graphics system that provides a great deal of control and flexibility in the appearance and arrangement of graphical output. It does not provide high-level functions that create complete plots. What it does provide is a basis for developing such high-level functions (e.g., the lattice package), the facilities for customizing and manipulating lattice output, the ability to produce high-level plots or non-statistical images from scratch, and the ability to add sophisticated annotations to the output from base graphics functions (see the gridBase package). For more information, see the help files for grid.
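
As a minimal sketch of what "low-level" means here, the example below draws a rectangle, a line, and a label inside a viewport. The functions used (viewport, pushViewport, rectGrob, grid.draw, grid.lines, grid.text, popViewport) are part of grid but are not listed in this appendix, and the sizes and label text are arbitrary choices for illustration.

```r
library(grid)

pdf(NULL)        # PDF device that creates no file
grid.newpage()

# A viewport is a rectangular drawing region; this one covers 80% of the page
pushViewport(viewport(width = 0.8, height = 0.8))

# Graphical objects ("grobs") can be created first and drawn later
box <- rectGrob(gp = gpar(col = "grey40"))
grid.draw(box)

# Or drawn directly; coordinates are relative to the current viewport
grid.lines(x = c(0.1, 0.9), y = c(0.2, 0.7))
grid.text("drawn inside a viewport", y = 0.9)

popViewport()
dev.off()
```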

KernSmooth

This package provides functions for kernel smoothing.

Functions

Function Description
bkde Returns x and y coordinates of the binned kernel density estimate of the probability density of the data.
bkde2D Returns the set of grid points in each coordinate direction, and the matrix of density estimates over the mesh induced by the grid points. The kernel is the standard bivariate normal density.
bkfe Returns an estimate of a binned approximation to the kernel estimate of the specified density function. The kernel is the standard normal density.
dpih Uses direct plug-in methodology to select the bin width of a histogram.
dpik Uses direct plug-in methodology to select the bandwidth of a kernel density estimate.
dpill Uses direct plug-in methodology to select the bandwidth of a local linear Gaussian kernel regression estimate.
locpoly Estimates a probability density function, regression function, or their derivatives using local polynomials. A fast binned implementation over an equally spaced grid is used.
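
The bandwidth selectors and the estimators are usually used in pairs. As a sketch, the code below picks a bandwidth with dpik and passes it to bkde. The simulated sample and the variable names are assumptions for this example.

```r
library(KernSmooth)

set.seed(1)
x <- rnorm(200)  # simulated sample for illustration only

# Select a bandwidth by the direct plug-in method
h <- dpik(x)

# Binned kernel density estimate using that bandwidth
est <- bkde(x, bandwidth = h)

# est is a list with grid points (x) and density values (y)
print(h)
str(est)
```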

lattice

Trellis graphics is a framework for data visualization developed at Bell Labs by Richard Becker, William Cleveland, et al., extending ideas presented in Bill Cleveland's 1993 book Visualizing Data.

Lattice is best thought of as an implementation of Trellis graphics for R. It is built upon the grid graphics engine and requires the grid add-on package. It is not (readily) compatible with traditional R graphics tools. The public interface is based on the implementation in S-PLUS, but features several extensions, in addition to incompatibilities introduced through the use of grid. To the extent possible, care has been taken to ensure that existing Trellis code written for S-PLUS works unchanged (or with minimal change) in lattice. If you are having problems porting S-PLUS code, read the entry for panel in the documentation for xyplot. Most high-level Trellis functions in S-PLUS are implemented, with the exception of piechart.
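
To illustrate the role of the panel argument mentioned above, this sketch calls xyplot with a custom panel function that draws the points and then adds a fitted line in each panel. The mtcars data set (from the datasets package), the conditioning on cyl, and the dashed line type are assumptions chosen for the example.

```r
library(lattice)

# One panel per number of cylinders; the panel function runs once per panel
p <- xyplot(mpg ~ wt | factor(cyl), data = mtcars,
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)       # default scatter plot
              panel.lmline(x, y, lty = 2)   # least-squares line for this panel
            })

# High-level lattice functions return a trellis object; printing draws it
pdf(NULL)
print(p)
dev.off()
```

Because the plot is an object rather than a side effect, it has to be printed explicitly when created inside a function or a script.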

Functions

Function Description
Rows Convenience function to extract a subset of a list. Usually used in creating keys.
as.shingle, as.factorOrShingle Functions to handle shingles.
axis.default Default function for drawing axes in lattice plots.
banking Calculates banking slope.
barchart Draws bar charts.
bwplot Draws box plots.
canonical.theme Initialization of a display device with appropriate graphical parameters.
cloud Generic function to draw 3D scatter plots and surfaces. The "formula" methods do most of the actual work.
col.whitebg Initialization of a display device with appropriate graphical parameters.
contourplot Draws level plots and contour plots.
current.column Returns an integer index specifying which column in the layout is currently active.
current.panel.limits Used to retrieve a panel's x and y limits.
current.row Returns an integer index specifying which row in the layout is currently active.
densityplot Draws histograms and kernel density plots, possibly conditioned on other variables.
diag.panel.splom This is the default superpanel function for splom.
do.breaks Draws histograms and kernel density plots, possibly conditioned on other variables.
dotplot Draws Cleveland dot plots.
draw.colorkey Produces (and possibly draws) a grid frame grob, which is a color key that can be placed in other grid plots. Used in levelplot.
draw.key Produces (and possibly draws) a grid frame grob, which is a legend (aka key) that can be placed in other grid plots.
equal.count Function to handle shingles.
histogram Draws histograms and kernel density plots, possibly conditioned on other variables.
is.shingle Function to handle shingles.
larrows, llines, lplot.xy, lpoints, lpolygon, lrect, lsegments, ltext, panel.points, panel.polygon, panel.rect, panel.segments, panel.text These functions are intended to replace common low-level traditional graphics functions, primarily for use in panel functions. The originals cannot be used (at least not easily) because lattice panel functions need to use grid graphics. Low-level drawing functions in grid can be used directly as well and are often more flexible. These functions are provided for convenience and portability.
lattice.getOption, lattice.options Functions to handle settings used by lattice. Their main purpose is to make code maintenance easier, and users normally should not need to use these functions. However, fine control at this level may be useful in certain cases.
latticeParseFormula Used by high-level lattice functions like xyplot to parse the formula argument and evaluate various components of the data.
level.colors Calculates false colors from a numeric variable (including factors, using their numeric codes) given a color scheme and break points.
levelplot Draws level plots and contour plots.
ltransform3dMatrix, ltransform3dto3d These are (related to) the default panel functions for cloud and wireframe.
make.groups Combines two or more vectors, possibly of different lengths, producing a data frame with a second column indicating which of these vectors that row came from. This is mostly useful for getting data into a form suitable for use in high-level lattice functions.
oneway Fits a one-way model to univariate data grouped by a factor, the result often being displayed using rfs.
packet.number A function that identifies which packet each observation in the data is part of.
packet.panel.default Default function in lattice to determine, given the column, row, page, and other relevant information, the packet (if any) that should be used in a panel.
panel.3dscatter, panel Default panel functions controlling cloud and wireframe displays.
panel.abline Adds a line of the form y = a + bx or vertical and/or horizontal lines.
panel.average Treats one of x and y as a factor (according to the value of horizontal), calculates fun applied to the subsets of the other variable determined by each unique value of the factor, and joins them by a line.
panel.arrows Draws arrows in a panel.
panel.axis The function used by lattice to draw axes. It is typically not used by users, except those wishing to create advanced annotation. Keep in mind issues of clipping when trying to use it as part of the panel function. current.panel.limits can be used to retrieve a panel's x and y limits.
panel.barchart Default panel function for barchart.
panel.brush.splom panel.link.splom is meant for use with splom and requires a panel to be chosen using trellis.focus before it is called. Clicking on a point causes that and the corresponding projections in other pairwise scatter plots to be highlighted.
panel.bwplot Default panel function for bwplot.
panel.cloud Default panel function controlling cloud and wireframe displays.
panel.contourplot Default panel function for levelplot.
panel.curve Adds a curve, similar to what curve does with add=TRUE. Graphical parameters for the line are obtained from the add.line setting.
panel.densityplot Default panel function for densityplot.
panel.dotplot Default panel function for dotplot.
panel.error Default handler used when an error occurs while executing a panel function.
panel.fill Fills the panel with a specified color.
panel.grid Draws a reference grid.
panel.histogram Default panel function for histogram.
panel.identify Similar to identify. When called, it waits for the user to identify points (in the panel being drawn) via mouse clicks.
panel.levelplot Default panel function for levelplot.
panel.linejoin panel.linejoin is an alias for panel.average that was retained for backward compatibility and may go away in the future.
panel.lines Plots lines in a panel.
panel.link.splom The classic Trellis paradigm is to plot the whole object at once, without the possibility of interacting with it afterward. However, by keeping track of the grid viewports where the panels and strips are drawn, it is possible to go back to them afterward and enhance them one panel at a time. This function provides convenient interfaces to help in this. Note that this is still experimental and the exact details may change in the future.
panel.lmline panel.lmline(x, y) is equivalent to panel.abline(lm(y ~ x)).
panel.loess Adds a smooth curve (fitted by loess).
panel.mathdensity Plots a (usually theoretical) probability density function.
panel.number Returns an integer counting which panel is being drawn (starting from 1 for the first panel, aka the panel order).
panel.pairs Default superpanel function for splom.
panel.parallel Default panel function for parallel.
panel.qq Default panel function for qq.
panel.qqmath Default panel function for qqmath.
panel.qqmathline Useful panel function with qqmath. Draws a line passing through the points (usually) determined by the .25 and .75 quantiles of the sample and the theoretical distribution.
panel.refline Similar to panel.abline, but uses the "reference.line" settings for the defaults.
panel.rug Adds a rug representation of the (marginal) data to the panel.
panel.smoothScatter Allows the user to place smoothScatter plots in lattice graphics.
panel.splom Default panel function for splom.
panel.stripplotDefault panel function for stripplot. Also see panel.superpose.
panel.superpose, panel.superpose.2These are panel functions for Trellis displays, whichare useful when a grouping variable is specified for usewithin panels. The x (andy where appropriate)variables are plotted with different graphical parameters foreach distinct value of the grouping variable.
panel.tmd.default, panel.tmd.qqmathDefault panel functions for tmd.
panel.violinThis is a panel function that can create a violin plot.It is typically used in a high-level call to bwplot.
panel.wireframeDefault panel functions controlling cloud and wireframe displays.
panel.xyplotDefault panel function for xyplot.
parallelDraws conditional scatter plot matrices and parallel coordinate plots.
prepanel.default.bwplot, prepanel.default.cloud, prepanel.default.densityplot, prepanel.default.histogram, prepanel.default.levelplot, prepanel.default.parallel, prepanel.default.qq, prepanel.default.qqmath, prepanel.default.splom, prepanel.default.xyplotThese prepanel functions are used as fallback defaults in various high-level plot functions in lattice. These are rarely useful to normal users, but may be helpful in developing new displays.
prepanel.lmline, prepanel.loess, prepanel.qqmathlineThese are predefined prepanel functions available in lattice.
prepanel.tmd.default, prepanel.tmd.qqmathDefault prepanel functions for tmd, which creates Tukey mean-difference plots from a trellis object returned by xyplot, qq, or qqmath. The prepanel and panel functions are used as appropriate. The formula method for tmd is provided for convenience and simply calls tmd on the object created by calling xyplot on that formula.
qqQuantile-quantile plots for comparing two distributions.
qqmathQuantile-quantile plot of a sample and a theoretical distribution.
rfsPlots fitted values and residuals (via qqmath) on a common scale for any object that has methods for fitted values and residuals.
shingleFunctions to handle shingles.
show.settingsFunction used to query, display, and modify graphical parameters for fine control of Trellis displays. Modifications are made to the settings for the currently active device only.
simpleKeySimple interface to generate a list appropriate for draw.key.
simpleThemeSimple interface to generate a list appropriate as a theme, typically used as the par.settings argument in a high-level call.
splomDraws conditional scatter plot matrices and parallel coordinate plots.
standard.themeInitialization of a display device with appropriategraphical parameters.
strip.customProvides a convenient way to obtain new strip functions that differ from strip.default only in the default values of certain arguments.
strip.defaultFunction that draws the strips by default in Trellis plots. Users can write their own strip functions, but most commonly this involves calling strip.default with slightly different arguments.
stripplotDraws strip plots in lattice.
tmdCreates Tukey mean-difference plots from a trellis object returned by xyplot, qq, or qqmath. The formula method for tmd is provided for convenience and simply calls tmd on the object created by calling xyplot on that formula.
trellis.currentLayoutReturns a matrix with as many rows and columns as in the layout of panels in the current plot.
trellis.deviceInitialization of a display device with appropriategraphical parameters.
trellis.focus, trellis.grobnameThe function trellis.focus can be used to move to a particular panel or strip, identified by its position in the array of panels.
trellis.last.objectUpdate method for objects of class "trellis", and a way to retrieve the last printed trellis object (that was saved).
trellis.panelArgsOnce a panel or strip is in focus (e.g., by using trellis.switchFocus), trellis.panelArgs can be used to retrieve the arguments that were available to the panel function at that position.
trellis.par.get, trellis.par.setFunctions used to query, display, and modify graphical parameters for fine control of Trellis displays. Modifications are made to the settings for the currently active device only.
trellis.switchFocusA convenience function to switch from one viewport to another, while preserving the current row and column.
trellis.unfocusUnsets the focus and makes the top-level viewport the current viewport.
trellis.vpnameReturns the name of a viewport.
which.packetReturns the combination of levels of the conditioning variables in the form of a numeric vector as long as the number of conditioning variables, with each element an integer indexing the levels of the corresponding variable.
wireframeGeneric function to draw 3D scatter plots and surfaces. The "formula" methods do most of the actual work.
xscale.components.default, yscale.components.defaultReturn a list of the form suitable as the components argument of axis.default.
xyplotProduces conditional scatter plots.
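
The panel functions listed above are usually combined inside a user-written panel function that is passed to a high-level call. The following is only a minimal sketch of that pattern, not a recommended display: the name panel.with.rug is hypothetical, and the choice of the ethanol data set (described under Data Sets) and of a reference line at the mean of y is an assumption made for illustration.

```r
library(lattice)

# Hypothetical custom panel function: draw the default scatter plot,
# then add marginal rugs and a horizontal reference line at mean(y).
panel.with.rug <- function(x, y, ...) {
    panel.xyplot(x, y, ...)        # default panel function for xyplot
    panel.rug(x = x, y = y)        # rug of the marginal data
    panel.refline(h = mean(y))     # uses the "reference.line" settings
}

# One panel per compression ratio (C); NOx plotted against
# the equivalence ratio (E).
p <- xyplot(NOx ~ E | factor(C), data = ethanol,
            panel = panel.with.rug)

# A high-level lattice call returns an object of class "trellis";
# printing it draws the plot on the active device.
print(p)
```

Because the plot is an ordinary object, it can be stored, updated, and printed later, which is what functions such as trellis.last.object rely on.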

Data Sets

Data Set	Class	Description
barley	data.frame	Total yield in bushels per acre for 10 varieties at 6 sites in each of 2 years.
environmental	data.frame	Daily measurements of ozone concentration, wind speed, temperature, and solar radiation in New York City from May to September of 1973.
ethanol	data.frame	Ethanol fuel was burned in a single-cylinder engine. For various settings of the engine compression and equivalence ratio, the emissions of nitrogen oxides were recorded.
melanoma	data.frame	This data from the Connecticut Tumor Registry presents age-adjusted numbers of melanoma skin cancer incidences per 100,000 people in Connecticut for the years 1936–1972.
singer	data.frame	Heights, in inches, of the singers in the New York Choral Society in 1979. The data is grouped according to voice part. The vocal range for each voice part increases in pitch according to the following order: Bass 2, Bass 1, Tenor 2, Tenor 1, Alto 2, Alto 1, Soprano 2, Soprano 1.
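
These data sets become available once lattice is attached. As a brief, hedged illustration, the sketch below inspects singer and draws one violin per voice part with panel.violin; the variable names v and parts are hypothetical, and the plot is just one plausible use of the data.

```r
library(lattice)

# singer holds one row per singer: a height (in inches) and a voice part.
str(singer)

# The factor levels follow the pitch order given in the description,
# from Bass 2 up to Soprano 1.
parts <- levels(singer$voice.part)
print(parts)

# Violin plot of height by voice part; panel.violin is typically
# used in a high-level call to bwplot.
v <- bwplot(voice.part ~ height, data = singer, panel = panel.violin)
print(v)
```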
