Monday, June 17, 2013

getGEO returns a list

> Dear All,
>
> I am trying to analyze a Microarray series data using "GEOquery". I
> get the below
> error when trying to extract the information.
>
> > gse = getGEO("GSE15709");
> > Meta(gse)
> Error in function (classes, fdef, mtable)  :
>  unable to find an inherited method for function "Meta", for signature
> "list"
> > GSMList(gse)[[1]]
> Error in function (classes, fdef, mtable)  :
>  unable to find an inherited method for function "GSMList", for
> signature "list"
>
> Thanks in advance for your help.
>
>
getGEO() defaults to GSEMatrix=TRUE.  Per the documentation, this returns a
list of ExpressionSet objects.  So, gse[[1]] is an ExpressionSet.  The
accessors that you are using above are for returns from calls to getGEO()
with GSEMatrix=FALSE.  Unless there is a specific need to get information
from the full GSE SOFT file, I would recommend staying with GSEMatrix=TRUE
and working with the more standard ExpressionSet object(s).

Hope that helps.

Sean


#########################
B = as(gse[[1]],"ExpressionSet")

then good to go.
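
As a rough sketch of the advice above, the first ExpressionSet can be pulled out of the list and inspected with the usual Biobase accessors. The helper name first_eset is made up for illustration, and the calls need the GEOquery package plus network access, so the usage lines are left commented out.

```r
# Hypothetical helper (not part of GEOquery): fetch a GSE with the default
# GSEMatrix = TRUE and return the first ExpressionSet in the resulting list.
first_eset <- function(accession) {
    gse <- GEOquery::getGEO(accession, GSEMatrix = TRUE)  # list of ExpressionSet objects
    gse[[1]]
}

# Usage (needs network access and the GEOquery package installed):
# eset <- first_eset("GSE15709")
# dim(Biobase::exprs(eset))     # expression matrix, probes x samples
# head(Biobase::pData(eset))    # sample annotation
# head(Biobase::fData(eset))    # probe (feature) annotation
```

A series measured on more than one platform gives a list with more than one element, so it is worth checking length() of the getGEO() result before taking [[1]].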

Tuesday, June 11, 2013

RMA over a data.frame or a matrix

> Dear List,
>  
> I was wondering if RMA function can be applied to a data frame or a matrix??
> Thanks.
>  
> -Sohail


Hi,

If your matrix contains the intensity data from actual microarrays,
this is a function you could use (or leverage):

#
# intensity and intensitySD are objects equivalent to matrices.
#   They should contain probe intensities in rows and samples in
#   columns. Probe intensities should be ordered the same way they
#   are in a .CEL file.
# chipType is a string containing the chip type (see function
#   cleancdfname()).
# chipSizeX and chipSizeY specify the number of features on the chip.
#
CreateAffyBatch <- function( intensity, intensitySD, chipType, chipSizeX, chipSizeY )
{
    exprs <- data.matrix( intensity )
    se.exprs <- data.matrix( intensitySD )

    sampleIndices <- 1 : ncol( exprs )
    sampleNames <- paste( "sample", sampleIndices, sep = "" )

    colnames( exprs ) <- sampleNames
    colnames( se.exprs ) <- sampleNames

    phenoData.data <- as.data.frame( sampleIndices )
    rownames( phenoData.data ) <- sampleNames
    colnames( phenoData.data ) <- "sample"
    
    if( length( findClass( "AnnotatedDataFrame" ) ) == 1 )
    {
        phenoData <- new( "AnnotatedDataFrame", data = phenoData.data )
        phenoData@varMetadata[[1]] <- "arbitrary numbering"
    } else {
        phenoData <- new( "phenoData", pData = phenoData.data, varLabels = list( sample = "arbitrary numbering" ) )
    }

    affyBatch <- new( "AffyBatch", exprs = exprs, se.exprs = se.exprs,
        cdfName = chipType, annotation = cleancdfname( chipType, addcdf = FALSE ),
        ncol = chipSizeX, nrow = chipSizeY, phenoData = phenoData )

    return( affyBatch )
}

Friday, May 10, 2013

Solving a transcendental equation with MATLAB

Example:

To solve:

cos(x)*cosh(x)+1=0


Matlab commands:

f = @(x) cos(x)*cosh(x) + 1;
fzero(f,2)


reference:

http://www.mathworks.com/help/matlab/ref/fzero.html
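
Since most of the other notes here are in R, the same root can be cross-checked with base R's uniroot(). This is only a sketch: the bracketing interval c(1, 3) is my own choice (the function is positive at 1 and negative at 3), whereas the MATLAB call above starts from the single guess 2.

```r
# Same function as in the MATLAB example.
f <- function(x) cos(x) * cosh(x) + 1

# uniroot() needs an interval with a sign change; f(1) > 0 and f(3) < 0.
sol <- uniroot(f, interval = c(1, 3), tol = 1e-10)

sol$root      # the smallest positive root, about 1.8751
sol$f.root    # value of f at the root, should be close to 0
```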

Thursday, April 25, 2013

Installed postfix, but no qshape? (CC)


Here’s a simple one…
I installed a fresh FC12 and then postfix, using "yum install postfix". After configuring main.cf, postfix ran fine through our simple testing, so I put it in service on a limited basis. However, once we started using it, qshape failed with an error indicating that it was not found.
It took me a couple of hours of poor googling to find the answer, so if you find yourself in the same pickle, hopefully this will help you.
First, install the Perl packages with this command:
yum groupinstall "Perl Development"
Once that is done (and here’s the magic), run:
yum install postfix-perl-scripts
I know, qshape is *supposed* to be installed with postfix. Only it wasn’t, and it took me all morning to figure out how to get it in there…
…It worked for me.

Wednesday, April 24, 2013

Blog for Linux System Administrators. : Sendmail vs Postfix vs Qmail vs Exim

We have a choice of MTA on Linux: sendmail, postfix, qmail or exim. The selection of an MTA depends on many factors, such as follo...

Thursday, April 11, 2013

Process NIH GEO GSE data with GEOquery

How to get an expression value table for a GSE* series from the GEO website.

For example: GSE33147

>library(GEOquery)
>g = getGEO("GSE33147")

......

This may download a series matrix file: GSE33147_series_matrix.txt.gz
Then load the data again from that file, which returns a single ExpressionSet instead of a list:
>g = getGEO(filename="GSE33147_series_matrix.txt.gz")

check the data

>class(g)

get the ExpressionSet:

> e = as(g, "ExpressionSet")

get the data table
> f = exprs(e)

save:
> write.csv(f, file="***")

Load the names of the genes that you want (assuming you only want some of them).
The names are stored in the file "top60.csv":
>genes = read.csv("top60.csv",header=TRUE)

The gene names may be read in as factors, so convert them to character:
> cgenes = as.character(genes[,1])               # the first column
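
The note stops here, but the natural next step is presumably to keep only the rows of the expression table whose names appear in cgenes. A minimal sketch with toy data, assuming the row names of f use the same identifiers as the names in "top60.csv":

```r
# Toy stand-ins for f (expression matrix) and cgenes (wanted row names).
f <- matrix(1:12, nrow = 4,
            dimnames = list(c("g1", "g2", "g3", "g4"), c("s1", "s2", "s3")))
cgenes <- c("g2", "g4", "g9")            # "g9" is absent from f on purpose

# Keep only the names that really exist in f, in the order given by cgenes.
keep <- intersect(cgenes, rownames(f))
sub <- f[keep, , drop = FALSE]           # drop = FALSE keeps a matrix even for one row
```

Going through intersect() avoids the "subscript out of bounds" error that f[cgenes, ] would raise when a requested name is missing from the matrix.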

Wednesday, April 10, 2013

Processing Microarray data by using R and Bioconductor

Load all the .CEL.gz files from a folder:

>library(affy)
>data <- ReadAffy()

## For Affymetrix data, there is no concept of RAW data.
See page 72 of DNA microarray data analysis using Bioconductor.

Now get the RMA data
> data.rma <- rma(data)

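Or, if the data set is too large to hold as an AffyBatch, use justRMA(), which reads the CEL files in the working directory directly instead of taking the ReadAffy() result:

>data.rma <- justRMA()


Extract the expression values from the RMA result:
data.e <- exprs(data.rma)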

save the data.e
write.csv(data.e,file="data.csv")

##now process the data and map probe ID to gene ID
##by any script language ....such as python
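
The probe-to-gene step can also be done without leaving R. The sketch below uses toy data and a made-up lookup vector probe2gene; in practice the mapping would come from an annotation package or file. Averaging the probes that hit the same gene is my assumption, not something the original workflow specifies.

```r
# Toy expression matrix (probes x samples) standing in for data.e.
data.e <- matrix(c(1, 3, 5, 2, 4, 6), nrow = 3,
                 dimnames = list(c("p1", "p2", "p3"), c("s1", "s2")))

# Hypothetical probe-to-gene lookup, named by probe ID.
probe2gene <- c(p1 = "GENE_A", p2 = "GENE_A", p3 = "GENE_B")

genes <- probe2gene[rownames(data.e)]

# Sum the probes per gene, then divide by the probe count to get the mean.
gene.sum <- rowsum(data.e, group = genes)
gene.n <- as.vector(table(genes)[rownames(gene.sum)])
gene.e <- gene.sum / gene.n      # genes x samples
```

The result gene.e could then be written out with write.csv() and read back in as the processed data.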


Now load the processed data into R again:
d <- read.table("processed_data.csv",header=TRUE,sep=",")

The first column of d holds the names; move it to the row names so that only the numeric columns remain:

df <- data.frame(d,row.names=1)    ## use the first column as row names

calculate the mean and append to df

df$mean <- rowMeans(df)

Rank by mean, highest first:

dfr <- df[order(-df$mean),]

Save the top 60 rows to a .csv file:

write.csv(dfr[1:60,],file="d60.csv")
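
Here is the ranking step on toy data, as a small self-contained sketch. It uses head() instead of indexing with 1:60, because head() simply returns fewer rows when the table is shorter than requested, while an index past the end produces rows of NA. The column and row names are invented for the example.

```r
# Toy table: rows are genes, columns are samples.
df <- data.frame(s1 = c(1, 9, 5), s2 = c(3, 7, 5),
                 row.names = c("g1", "g2", "g3"))

# Row means over the sample columns only.
df$mean <- rowMeans(df[, c("s1", "s2")])

# Sort by mean, highest first, and keep the top n rows.
dfr <- df[order(-df$mean), ]
top <- head(dfr, 2)
```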