From g.barton at imperial.ac.uk Thu Aug 2 13:32:29 2007
From: g.barton at imperial.ac.uk (Barton, Geraint R)
Date: Thu, 2 Aug 2007 14:32:29 +0100
Subject: [Plr-general] serializing dataframes
Message-ID: <60BD7AC787479D47A3D13E42D836101D024F4106@icex4.ic.ac.uk>

Dear plr people,

I am having a problem getting a serialized data frame from plr into the db. Instead of returning the whole serialized data frame, the serializing function returns just the first 2 characters of the serialized text object.

We have recently moved to a new server and installed up-to-date versions of various bits of software. All seem to be working well together, apart from getting serialized results into the db.

On the new server:
plr v8.2.0.5
postgres v8.2.4
R v2.5.1

The serialization was working on our old server:
postgres v8.0.1
R v2.2.0
plr v (older than v8.2.05)

On the new server this is the result of the serialize_df function:

 serialize_df
--------------
 41
(1 row)

On the old server the serialized data frame is spat out as it is being serialized.

The serialize function:

CREATE OR REPLACE FUNCTION serialize_df(text)
  RETURNS text AS
$BODY$
dataset <- get(arg1, env=.GlobalEnv)
return (serialize(dataset, NULL, ascii = TRUE))
$BODY$
  LANGUAGE 'plr' VOLATILE;
ALTER FUNCTION serialize_df(text) OWNER TO gbarton;

The serialize function works in R separately, so I do not believe that R is at fault. Also, when serialize_df(text) is called from psql it appears to work and takes about the same amount of time as it would in R, but it just returns the 2 characters shown above. I think it could be something to do with the way that either plr or postgres v8.2 is returning long text strings.

Where has my serialized object gone? How can I serialize it to the db? Has anyone else seen similar behaviour with serializing objects? Any advice on how to sort this out would be appreciated.

I did try to have a look at the plr archives, but can't seem to find anything before June 2007 now. Where did the archives go?

thanks
geraint

From mail at joeconway.com Fri Aug 3 04:34:25 2007
From: mail at joeconway.com (Joe Conway)
Date: Thu, 02 Aug 2007 21:34:25 -0700
Subject: [Plr-general] serializing dataframes
In-Reply-To: <60BD7AC787479D47A3D13E42D836101D024F4106@icex4.ic.ac.uk>
References: <60BD7AC787479D47A3D13E42D836101D024F4106@icex4.ic.ac.uk>
Message-ID: <46B2B051.6010707@joeconway.com>

Barton, Geraint R wrote:
> [...]
> on the new server:
> plr v8.2.0.5
> postgres v8.2.4
> R v 2.5.1
>
> the serialization was working on our old server
> postgres v 8.0.1,
> R v2.2.0
> plr v (older than v8.2.05)
>
> On the new server this is the result of the serialize_df function
>  serialize_df
> --------------
>  41
> (1 row)
> on the old server the serialized dataframe is spat out as it is
> being serialized.

Looks like a change in R behavior -- from the release notes:

CHANGES IN R VERSION 2.4.1 patched
[...]
    serialize(connection = NULL) now returns a raw vector (and not a
    character string). unserialize() accepts both old and new formats
    (and has since 2.3.0).
[...]

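The behaviour described in that release note can be checked in a plain R session, outside PL/R. The following is only a minimal sketch and is not taken from the thread: the object my_data and the variable names are illustrative, and the link to the 41 reported from psql is an assumption about how PL/R coerces a raw vector when the function is declared to return a single text value.

```r
# Illustrative object; any R object would do here.
my_data <- list(121, 221, 3333)

# With connection = NULL, serialize() returns a raw vector rather than
# a character string (the change quoted from the release notes).
ser <- serialize(my_data, NULL, ascii = TRUE)
is_raw <- is.raw(ser)

# An ASCII serialization starts with the byte for "A" (0x41). Coercing a
# raw byte to character yields its two hex digits.
first_byte <- as.character(ser[1])

# rawToChar() collapses the whole raw vector into one string, and
# charToRaw() reverses that, so the object survives a trip through text.
txt <- rawToChar(ser)
restored <- unserialize(charToRaw(txt))

print(is_raw)
print(first_byte)
print(identical(restored, my_data))
```

If PL/R hands back only the first element of the coerced vector for a scalar text return type (an assumption, not something confirmed here), that would be consistent with the two characters seen on the new server.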
If you change your function like this:

8<------------------------------------------
CREATE OR REPLACE FUNCTION serialize_df(text)
  RETURNS setof text AS
$BODY$
dataset <- get(arg1, env=.GlobalEnv)
return (serialize(dataset, NULL, ascii = TRUE))
$BODY$
  LANGUAGE 'plr' VOLATILE;
8<------------------------------------------

and call it like this:

8<------------------------------------------
select * from serialize_df('df');
8<------------------------------------------

you'll see the entire vector as a series of rows. Or perhaps better would be this, which returns a text array:

8<------------------------------------------
CREATE OR REPLACE FUNCTION serialize_df(text)
  RETURNS text[] AS
$BODY$
dataset <- get(arg1, env=.GlobalEnv)
return (serialize(dataset, NULL, ascii = TRUE))
$BODY$
  LANGUAGE 'plr' VOLATILE;

select serialize_df('df');
8<------------------------------------------

HTH,

Joe

From g.barton at imperial.ac.uk Fri Aug 3 10:47:43 2007
From: g.barton at imperial.ac.uk (Barton, Geraint R)
Date: Fri, 3 Aug 2007 11:47:43 +0100
Subject: [Plr-general] serializing dataframes
In-Reply-To: <46B2B051.6010707@joeconway.com>
References: <60BD7AC787479D47A3D13E42D836101D024F4106@icex4.ic.ac.uk> <46B2B051.6010707@joeconway.com>
Message-ID: <60BD7AC787479D47A3D13E42D836101D024F4131@icex4.ic.ac.uk>

Thanks Joe,

I am using your second method:

8<------------------------------------------
CREATE OR REPLACE FUNCTION serialize_df(text)
  RETURNS text[] AS
$BODY$
dataset <- get(arg1, env=.GlobalEnv)
return (serialize(dataset, NULL, ascii = TRUE))
$BODY$
  LANGUAGE 'plr' VOLATILE;

select serialize_df('df');
8<------------------------------------------

But the problem I am having is unserializing the data frame back into R.
I have amended my unserialize_df function to accept the text[]:

8<----------------------------------------------------
CREATE OR REPLACE FUNCTION unserialize_df(text[], integer, text)
  RETURNS integer AS
$BODY$
dataset<-unserialize(arg1)
assign(paste(arg3,"_",arg2,sep=""),dataset,env=.GlobalEnv)
return(1)
$BODY$
  LANGUAGE 'plr' VOLATILE;
8<------------------------------------------------------

But when I call this now it seems that R does not like the input type text[].

microarray=# SELECT unserialize_df('{df_ser2}', 101,'expdes') FROM expdes WHERE id = 404;
ERROR: R interpreter expression evaluation error
DETAIL: Error in unserialize(arg1) : unknown input format
CONTEXT: In PL/R function unserialize_df

I have tried playing around in R, trying to change df_ser into an object that unserialize recognizes (e.g. with charToRaw), but have not had any luck yet. Any ideas?

Thanks
geraint

From mail at joeconway.com Fri Aug 3 14:50:27 2007
From: mail at joeconway.com (Joe Conway)
Date: Fri, 03 Aug 2007 07:50:27 -0700
Subject: [Plr-general] serializing dataframes
In-Reply-To: <60BD7AC787479D47A3D13E42D836101D024F4131@icex4.ic.ac.uk>
References: <60BD7AC787479D47A3D13E42D836101D024F4106@icex4.ic.ac.uk> <46B2B051.6010707@joeconway.com> <60BD7AC787479D47A3D13E42D836101D024F4131@icex4.ic.ac.uk>
Message-ID: <46B340B3.4080300@joeconway.com>

Barton, Geraint R wrote:
>
> But when I call this now it seems that R does not like the input type
> text[].
> microarray=# SELECT unserialize_df('{df_ser2}', 101,'expdes') FROM
> expdes WHERE id = 404;
> ERROR: R interpreter expression evaluation error
> DETAIL: Error in unserialize(arg1) : unknown input format
> CONTEXT: In PL/R function unserialize_df
>
> I have tried playing around in R trying to change the df_ser into an
> object that unserialize recognizes eg charToRaw, but not having any luck
> yet. Any ideas?

Please post a complete example. It is difficult for me to understand what exactly you are doing. For one thing, what is '{df_ser2}'? Are you trying to cast that as an array? Is df_ser2 the field containing the returned array from the serialize function?

Joe

From g.barton at imperial.ac.uk Fri Aug 3 15:09:21 2007
From: g.barton at imperial.ac.uk (Barton, Geraint R)
Date: Fri, 3 Aug 2007 16:09:21 +0100
Subject: [Plr-general] serializing dataframes
In-Reply-To: <46B340B3.4080300@joeconway.com>
References: <60BD7AC787479D47A3D13E42D836101D024F4106@icex4.ic.ac.uk> <46B2B051.6010707@joeconway.com> <60BD7AC787479D47A3D13E42D836101D024F4131@icex4.ic.ac.uk> <46B340B3.4080300@joeconway.com>
Message-ID: <60BD7AC787479D47A3D13E42D836101D024F4141@icex4.ic.ac.uk>

In general what I am doing is using plr to analyse and store data from microarray experiments, serializing the dataframe objects to the db after each step of analysis and returning them to the plr session as needed.

This requires the 'affy' library from the bioconductor site:
http://bioconductor.org/packages/2.0/bioc/html/affy.html

An example data set can be found in this zip file (there are 6 .CEL data files and 1 'expdes_479.txt' phenoData description file to read the data in):
https://fileexchange.imperial.ac.uk/files/f482e9a411/testSerializeData

1) The first step is to read the data into the plr session from the filesystem. The first 2 arguments of the testreadaffy function are the path to the expdes_479.txt file and the path to the directory holding the .CEL files, respectively, and the 3rd is an id for the global variable in the plr session, e.g.:

microarray=# select * from testreadaffy('/usr/tmp/uploadedFiles/expdes_479.txt','/usr/tmp/uploadedFiles',404);
 testreadaffy
--------------
 134.5
(1 row)

########################## read in data and create the dataframe #############
CREATE OR REPLACE FUNCTION testreadaffy(text, text, integer)
  RETURNS double precision AS
$BODY$
pdata_filePath<-arg1
celfiles_filePath<-arg2
pd = read.AnnotatedDataFrame(arg1,header=TRUE,sep="",row.names=1)
myaffyexp = ReadAffy(filenames = rownames(pData(pd)), phenoData = pd,celfile.path=arg2,verbose=TRUE)
myHead<-head(exprs(myaffyexp))
firstRow = myHead[1]
globalEP<-paste("expdes_",arg3,sep="")
assign(globalEP, myaffyexp,env=.GlobalEnv)
return(firstRow)
$BODY$
  LANGUAGE 'plr' VOLATILE;
#########################################################

2) After I have created the dataframe in plr I then save the dataframe to the db.
It was previously saved to a column called df_ser, of type text, but as I am now returning a text[] I have changed the column in the expdes table to df_ser2:

microarray=# update expdes set df_ser2 =(SELECT serialize_df('expdes_404')) where id=404;
UPDATE 1

################## serialize function ###############################
CREATE OR REPLACE FUNCTION serialize_df(text)
  RETURNS text[] AS
$BODY$
dataset <- get(arg1, env=.GlobalEnv)
return (serialize(dataset, NULL, ascii = TRUE))
$BODY$
  LANGUAGE 'plr' VOLATILE;
#####################################################################

3) Then the next step is to unserialize this df from the db back to the plr session:

microarray=# SELECT unserialize_df('{df_ser2}', 101,'expdes') FROM expdes WHERE id = 404;
ERROR: R interpreter expression evaluation error
DETAIL: Error in unserialize(arg1) : unknown input format
CONTEXT: In PL/R function unserialize_df

############### unserialize function ###################
CREATE OR REPLACE FUNCTION unserialize_df(text[], integer, text)
  RETURNS integer AS
$BODY$
dataset<-unserialize(arg1)
assign(paste(arg3,"_",arg2,sep=""),dataset,env=.GlobalEnv)
return(1)
$BODY$
  LANGUAGE 'plr' VOLATILE;
#########################################################

Hope this helps,

Thanks
Geraint

From awitney at sgul.ac.uk Fri Aug 3 15:57:55 2007
From: awitney at sgul.ac.uk (Adam Witney)
Date: Fri, 03 Aug 2007 16:57:55 +0100
Subject: [Plr-general] serializing dataframes ... Simple test case
In-Reply-To: <60BD7AC787479D47A3D13E42D836101D024F4141@icex4.ic.ac.uk>
Message-ID:

I have been looking at this a little. I think this is a simple (rather crude) test case of the problem:

DROP TABLE test;
CREATE TABLE test (id int, data text[]);
------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION create_data(text) RETURNS text AS $BODY$
my_data <- list (121,221,3333)
assign(arg1, my_data, env=.GlobalEnv)
return(my_data[3])
$BODY$ LANGUAGE 'plr' VOLATILE;
------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION get_data(text, int) RETURNS text AS $BODY$
my_data <- get(arg1, env=.GlobalEnv)
return(my_data[arg2])
$BODY$ LANGUAGE 'plr' VOLATILE;
------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION serialize_my(text) RETURNS text[] AS $BODY$
dataset <- get(arg1, env=.GlobalEnv)
return (serialize(dataset, NULL, ascii = TRUE))
$BODY$ LANGUAGE 'plr' VOLATILE;
------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION load_data(text[], text) RETURNS text AS $BODY$
my_data <- unserialize(arg1)
assign(arg2, my_data, env=.GlobalEnv)
return(my_data[3])
$BODY$ LANGUAGE 'plr' VOLATILE;
------------------------------------------------------------------------
SELECT create_data('my_data');
INSERT INTO test VALUES(1, serialize_my('my_data'));
SELECT load_data(data::text[], 'my_data2') FROM test WHERE id = 1;
SELECT get_data('my_data2', 2);
SELECT get_data('my_data', 2);

(the last two lines should produce the same result)

The problem is that the load_data function fails, as the R function unserialize doesn't know what to do with the text[] array.

Hope that helps

adam

From mail at joeconway.com Fri Aug 3 17:21:04 2007
From: mail at joeconway.com (Joe Conway)
Date: Fri, 03 Aug 2007 10:21:04 -0700
Subject: [Plr-general] serializing dataframes ... Simple test case
In-Reply-To:
References:
Message-ID: <46B36400.1040103@joeconway.com>

Adam Witney wrote:
> I have been looking at this a little, I think this is a simple (rather
> crude) test case of the problem:

Thanks for the test case.
The problem seems to be that in recent releases R has moved to using "Raw" vectors for serialize/unserialize, which I take it is actually just a vector of raw bytes. Also note that the help for serialize/unserialize says these are experimental:

    Warning:

    These functions are still experimental. Names, interfaces and
    values might change in future versions (and the value of
    'serialize' was changed for R 2.4.0). '.saveRDS' and '.readRDS'
    are intended for internal use.

But in any case I think I found a working solution. There are rawToChar() and charToRaw() functions. Does the following do what you need?

8<----------------------------------------
DROP TABLE test;
CREATE TABLE test (id int, data text);

CREATE OR REPLACE FUNCTION create_data(text) RETURNS text AS $BODY$
my_data <- list (121,221,3333)
assign(arg1, my_data, env=.GlobalEnv)
return(my_data[3])
$BODY$ LANGUAGE 'plr' VOLATILE;

CREATE OR REPLACE FUNCTION get_data(text, int) RETURNS text AS $BODY$
my_data <- get(arg1, env=.GlobalEnv)
return(my_data[arg2])
$BODY$ LANGUAGE 'plr' VOLATILE;

CREATE OR REPLACE FUNCTION serialize_my(text) RETURNS text AS $BODY$
dataset <- get(arg1, env=.GlobalEnv)
return (rawToChar(serialize(dataset, NULL, ascii = TRUE)))
$BODY$ LANGUAGE 'plr' VOLATILE;

CREATE OR REPLACE FUNCTION load_data(text, text) RETURNS text AS $BODY$
my_data <- unserialize(charToRaw(arg1))
assign(arg2, my_data, env=.GlobalEnv)
return(my_data[3])
$BODY$ LANGUAGE 'plr' VOLATILE;

SELECT create_data('my_data');
INSERT INTO test VALUES(1, serialize_my('my_data'));

SELECT load_data(data, 'my_data2') FROM test WHERE id = 1;

SELECT get_data('my_data2', 2);
SELECT get_data('my_data', 2);
8<----------------------------------------

Joe

From awitney at sgul.ac.uk Mon Aug 6 13:05:55 2007
From: awitney at sgul.ac.uk (Adam Witney)
Date: Mon, 06 Aug 2007 14:05:55 +0100
Subject: [Plr-general] serializing dataframes ... Simple test case
In-Reply-To: <46B36400.1040103@joeconway.com>
Message-ID:

Hi Joe,

Yes that seems to do the trick. The reason this is useful is that some of the data frames we build are quite large and take some time to construct. Once built, it is much quicker to save a data frame to the database and then retrieve it later.

Thanks again

adam

From g.barton at imperial.ac.uk Mon Aug 6 13:13:03 2007
From: g.barton at imperial.ac.uk (Barton, Geraint R)
Date: Mon, 6 Aug 2007 14:13:03 +0100
Subject: [Plr-general] serializing dataframes ... Simple test case
In-Reply-To:
References: <46B36400.1040103@joeconway.com>
Message-ID: <60BD7AC787479D47A3D13E42D836101D024F4166@icex4.ic.ac.uk>

Hi,

Yes, thanks Joe and Adam. I have just finished testing on some real data and this appears to be working fine.

Thanks a lot
geraint

From billet at cebc.cnrs.fr Mon Aug 20 12:35:41 2007
From: billet at cebc.cnrs.fr (Norbert Billet)
Date: Mon, 20 Aug 2007 14:35:41 +0200
Subject: [Plr-general] protection stack overflow error
In-Reply-To: <60BD7AC787479D47A3D13E42D836101D024F4166@icex4.ic.ac.uk>
References: <46B36400.1040103@joeconway.com> <60BD7AC787479D47A3D13E42D836101D024F4166@icex4.ic.ac.uk>
Message-ID: <1187613341.6970.21.camel@PC160>

Dear Pl/R users,

I'm using Pl/R to read a binary file (with the readFormat() function of the R hexView package). It's a large file (200 MB) with hundreds of thousands of records that I will insert into a PostgreSQL database with the same Pl/R script.
But my script crashes with this message:

ERROR: R interpreter expression evaluation error
DETAIL: Error: protect(): protection stack overflow
CONTEXT: In PL/R function remigeimportgaoprofiles

As Pl/R is a procedural language for PostgreSQL, we can't commit periodically.

How can I determine the number of records that I can import without overflowing the stack?

thank you,
Norbert

From mail at joeconway.com Mon Aug 20 13:44:03 2007
From: mail at joeconway.com (Joe Conway)
Date: Mon, 20 Aug 2007 06:44:03 -0700
Subject: [Plr-general] protection stack overflow error
In-Reply-To: <1187613341.6970.21.camel@PC160>
References: <46B36400.1040103@joeconway.com> <60BD7AC787479D47A3D13E42D836101D024F4166@icex4.ic.ac.uk> <1187613341.6970.21.camel@PC160>
Message-ID: <46C99AA3.4050903@joeconway.com>

Norbert Billet wrote:
> Dear Pl/R users,
>
> I'm using Pl/R to read a binary file (with the readFormat() function of
> the R hexView package).
> It's a large file (200 MB) with hundreds of thousands of records that I
> will insert into a PostgreSQL database with the same Pl/R script.
>
> But my script crashes with this message:
>
> ERROR: R interpreter expression evaluation error
> DETAIL: Error: protect(): protection stack overflow
> CONTEXT: In PL/R function remigeimportgaoprofiles
>
> As Pl/R is a procedural language for PostgreSQL, we can't commit
> periodically.
>
> How can I determine the number of records that I can import without
> overflowing the stack?

First -- have you tried running this import script directly in R and if so, what happens then?

Second -- please show us the script. It is pretty much impossible to help without more detail, and possibly even a self-contained test case to reproduce the issue.

Joe From dan.hatfield at gmail.com Tue Aug 21 20:56:23 2007 From: dan.hatfield at gmail.com (Dan Hatfield) Date: Tue, 21 Aug 2007 16:56:23 -0400 Subject: [Plr-general] GPL license Message-ID: <1ca6fbed0708211356h3773e650s885f078e5306fcd0@mail.gmail.com> I see quite a few historical discussions around the GPL license. Was wondering if this has been revisited lately? Apparently, the issue was with R and it's GPL license... People's attitudes have changed to some degree toward BSD license's in recent years...was hoping perhaps that the R language folks might be convinced. :) Would be nice to see PL/R packaged into PostgreSQL. As an alternative, from a commercial standpoint, I've several interpretations of the GPL in this case. Since PL/R is packaged into PostregreSQL and not modified or extended but rather simply used, some have interpreted this as legitimate (and the commercial entity would not need to release their code, i.e. PL/R scripts).. (This would be similar to the BDB/Python arrangement). Anyone have any thoughts on this? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://pgfoundry.org/pipermail/plr-general/attachments/20070821/257cec23/attachment.html From mail at joeconway.com Wed Aug 22 00:08:42 2007 From: mail at joeconway.com (Joe Conway) Date: Tue, 21 Aug 2007 17:08:42 -0700 Subject: [Plr-general] GPL license In-Reply-To: <1ca6fbed0708211356h3773e650s885f078e5306fcd0@mail.gmail.com> References: <1ca6fbed0708211356h3773e650s885f078e5306fcd0@mail.gmail.com> Message-ID: <46CB7E8A.6030803@joeconway.com> Dan Hatfield wrote: > People's attitudes have changed to some degree toward BSD license's in > recent years...was hoping perhaps that the R language folks might be > convinced. :) As far as I know, libR and needed header files are still GPL, not LGPL. > Would be nice to see PL/R packaged into PostgreSQL. Even if PL/R were BSD licensed, I doubt it would be included in the core distribution. 
The trend has been away from including more stuff in the core distribution, and toward using pgfoundry and similar venues for Postgres related add-ons.

> As an alternative, from a commercial standpoint, I've several
> interpretations of the GPL in this case.
> Since PL/R is packaged into PostgreSQL and not modified or extended
> but rather simply used, some have interpreted this as legitimate (and
> the commercial entity would not need to release their code, i.e. PL/R
> scripts).
> (This would be similar to the BDB/Python arrangement.)

Not sure about that.

But since the R core developers mostly seem pretty firm on their conviction that libR should remain GPL, I'd rather not modify the PL/R license and thereby lose their support. I might rethink that stance if I knew _for_sure_ that it would make a difference in terms of bundling PL/R into the main Postgres distribution -- but like I said, I don't see that happening any time soon.

Joe

From dan.hatfield at gmail.com Wed Aug 22 18:11:38 2007
From: dan.hatfield at gmail.com (Dan Hatfield)
Date: Wed, 22 Aug 2007 14:11:38 -0400
Subject: [Plr-general] GPL license
In-Reply-To: <46CB7E8A.6030803@joeconway.com>
References: <1ca6fbed0708211356h3773e650s885f078e5306fcd0@mail.gmail.com> <46CB7E8A.6030803@joeconway.com>
Message-ID: <1ca6fbed0708221111l12079068l7eb07788abc74d1c@mail.gmail.com>

That's unfortunate but entirely understandable. At least using it in a software as a service offering is still a viable commercial alternative.

http://www.fsf.org/blogs/licensing/2007-03-29-gplv3-saas
http://www.linux-mag.com/id/3017/

On 8/21/07, Joe Conway wrote:
>
> Dan Hatfield wrote:
> > People's attitudes have changed to some degree toward BSD licenses in
> > recent years... I was hoping perhaps that the R language folks might be
> > convinced. :)
>
> As far as I know, libR and needed header files are still GPL, not LGPL.
>
> > Would be nice to see PL/R packaged into PostgreSQL.
>
> Even if PL/R were BSD licensed, I doubt it would be included in the core
> distribution. The trend has been away from including more stuff in the
> core distribution, and toward using pgfoundry and similar venues for
> Postgres related add-ons.
>
> > As an alternative, from a commercial standpoint, I've several
> > interpretations of the GPL in this case.
> > Since PL/R is packaged into PostgreSQL and not modified or extended
> > but rather simply used, some have interpreted this as legitimate (and
> > the commercial entity would not need to release their code, i.e. PL/R
> > scripts).
> > (This would be similar to the BDB/Python arrangement.)
>
> Not sure about that.
>
> But since the R core developers mostly seem pretty firm on their
> conviction that libR should remain GPL, I'd rather not modify the PL/R
> license and thereby lose their support. I might rethink that stance if I
> knew _for_sure_ that it would make a difference in terms of bundling
> PL/R into the main Postgres distribution -- but like I said, I don't see
> that happening any time soon.
>
> Joe
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://pgfoundry.org/pipermail/plr-general/attachments/20070822/a284adf3/attachment-0001.html

From billet at cebc.cnrs.fr Thu Aug 23 13:19:26 2007
From: billet at cebc.cnrs.fr (Norbert Billet)
Date: Thu, 23 Aug 2007 15:19:26 +0200
Subject: [Plr-general] protection stack overflow error
In-Reply-To: <46C99AA3.4050903@joeconway.com>
References: <46B36400.1040103@joeconway.com> <60BD7AC787479D47A3D13E42D836101D024F4166@icex4.ic.ac.uk> <1187613341.6970.21.camel@PC160> <46C99AA3.4050903@joeconway.com>
Message-ID: <1187875166.6135.23.camel@PC160>

Le lundi 20 août 2007 à
06:44 -0700, Joe Conway a écrit :
> Norbert Billet wrote:
> > Dear PL/R users,
> >
> > I'm using PL/R to read a binary file (with the readFormat() function of
> > the R hexView package).
> > It's a large file (200 MB) with hundreds of thousands of records that I
> > will insert into a PostgreSQL database with the same PL/R script.
> >
> > But my script crashes with this message:
> >
> > ERROR: R interpreter expression evaluation error
> > DETAIL: Error: protect(): protection stack overflow
> > CONTEXT: In PL/R function remigeimportgaoprofiles
> >
> >
> > Since PL/R is a scripting language for PostgreSQL, we can't commit
> > periodically.
> >
> >
> > How can I determine the number of records that I can import without
> > overflowing the stack?
>
> First -- have you tried running this import script directly in R and if
> so, what happens then?
>

Yes, it seems to be OK (my R script has been running for more than 48 h with no crash, and the number of inserted rows keeps increasing). I use the Bioconductor RdbiPgSQL package for this direct run.

> Second -- please show us the script. It is pretty much impossible to
> help without more detail, and possibly even a self contained test case
> to reproduce the issue.
OK, here is a simpler version of my PL/R script, but sufficient to produce the crash:

======================================================================
CREATE OR REPLACE FUNCTION remigeImportGAOTest(a_id_dataset int4, a_data_path text) RETURNS BOOLEAN AS $$

library("hexView")

#one index binary format
indexes_memformat <- memFormat(start=integer4, stop=integer4)

#one observation binary format
observation_memformat <- memFormat(profondeur=integer2, temperature=integer2, flagqual_prof=integer2, flagqual_tempe=integer2)

#file path
indexes_file_name <- paste(a_data_path, "/x2tgn.bin", sep="")
observation_file_name <- paste(a_data_path, "/TOGAN2.BIN", sep="")

#files list
year_index_file_name_list <- list.files(path=a_data_path, pattern="X2T*", full.names=TRUE)

#loading station index
station_index_table <- read.fwf(paste(a_data_path, "/TOGAP2.DIR", sep=""), c(1,7,3,2,1,8,4,1,1,4,2,2,2,2,3,2,4,2,1,6,1,1,1,1,3,5,1,4,6), as.is=c(FALSE))

#human readable names
names(station_index_table) <- c("carte", "profile_key", "instit_code", "code_pays", "ocean", "code_bateau", "cruise", "plateforme", "bl1", "an", "mois", "jour", "heure", "minute", "latdeg", "latmin", "londeg", "lonmin", "typro", "update", "validation", "qual1", "bidon1", "bidon2", "thermocline", "salinite", "qual2", "maxdepth", "cle")

#read all the files
for(current_year_index_file_name in year_index_file_name_list) {
  current_year <- as.numeric(substr(current_year_index_file_name, nchar(current_year_index_file_name) - 7, nchar(current_year_index_file_name) - 4))

  #reading the monthly index
  current_year_table <- read.table(current_year_index_file_name)

  #reading the 12 lines
  for(current_month in 1:12) {
    month_index_start <- current_year_table[current_month, 1]
    month_index_end <- current_year_table[current_month, 2]

    #is there monthly data?
    if(month_index_start != 999999 && month_index_end != 999999) {
      #yes: read using a sequential index
      for(current_header_record in month_index_start:month_index_end) {
        my_hydrological_profile_id <- pg.spi.exec("SELECT nextval('hydrological_profile_id_seq');")

        #inserting the hydrological profile data
        sql_query <- paste("INSERT INTO hydrological_profile(id, id_dataset) VALUES(", my_hydrological_profile_id$nextval, ", ", a_id_dataset, ");", sep="")
        pg.spi.exec(sql_query)

        #reading current record
        current_indexes <- readFormat(file=indexes_file_name, format=indexes_memformat, offset=(station_index_table[current_header_record, "cle"]-1)*8)

        for(observation_index in current_indexes$blocks$start$fileNum:current_indexes$blocks$stop$fileNum) {
          #reading one observation
          i <- 0
          current_observation <- readFormat(file=observation_file_name, format=observation_memformat, offset=(observation_index-1)*100+4+(i*8))

          #while temperature and depth are != 0
          while(i < 12 && (current_observation$blocks$profondeur$fileNum != 0 || current_observation$blocks$temperature$fileNum != 0)) {
            sql_query <- paste("INSERT INTO discrete_variable(id_hydrological_profile, kind) VALUES(", my_hydrological_profile_id$nextval, ", 't');", sep="")
            pg.spi.exec(sql_query)

            i <- i + 1
            current_observation <- readFormat(file=observation_file_name, format=observation_memformat, offset=(observation_index-1)*100+4+(i*8))
          }
        }
      }
    }
  }
}

return(TRUE)
$$ LANGUAGE 'plr';

--SELECT remigeImportGAOTest(7, '/data/GAO/DATA/HYDRO');
======================================================================

Thanks for your help, Joe.

Norbert

From mail at joeconway.com Thu Aug 23 13:56:00 2007
From: mail at joeconway.com (Joe Conway)
Date: Thu, 23 Aug 2007 06:56:00 -0700
Subject: [Plr-general] protection stack overflow error
In-Reply-To: <1187875166.6135.23.camel@PC160>
References: <46B36400.1040103@joeconway.com> <60BD7AC787479D47A3D13E42D836101D024F4166@icex4.ic.ac.uk> <1187613341.6970.21.camel@PC160> <46C99AA3.4050903@joeconway.com> <1187875166.6135.23.camel@PC160>
Message-ID: <46CD91F0.3070900@joeconway.com>

Norbert Billet wrote:
>
> OK, here is a simpler version of my PL/R script, but sufficient to
> produce the crash:

> my_hydrological_profile_id$nextval, ", 't');", sep="")
> pg.spi.exec(sql_query)

I should have asked this earlier -- what version of PL/R are you using?

I fixed a bug related to protection in the spi code not too long ago. See the release notes (http://www.joeconway.com/plr/):

- add missing UNPROTECT(1) in spi code -- fix for "stack imbalance" warning

Hundreds of thousands of calls to spi to do the inserts, combined with this bug, may be your problem.

That said, doing the inserts one at a time like this is likely to be very inefficient. You might want to consider using COPY in some form or another to bulk load your data.

Joe

From billet at cebc.cnrs.fr Thu Aug 23 14:49:51 2007
From: billet at cebc.cnrs.fr (Norbert Billet)
Date: Thu, 23 Aug 2007 16:49:51 +0200
Subject: [Plr-general] protection stack overflow error
In-Reply-To: <46CD91F0.3070900@joeconway.com>
References: <46B36400.1040103@joeconway.com> <60BD7AC787479D47A3D13E42D836101D024F4166@icex4.ic.ac.uk> <1187613341.6970.21.camel@PC160> <46C99AA3.4050903@joeconway.com> <1187875166.6135.23.camel@PC160> <46CD91F0.3070900@joeconway.com>
Message-ID: <1187880591.6135.42.camel@PC160>

Le jeudi 23 août 2007 à
06:56 -0700, Joe Conway a écrit :
> Norbert Billet wrote:
> >
> > OK, here is a simpler version of my PL/R script, but sufficient to
> > produce the crash:
> >
> > my_hydrological_profile_id$nextval, ", 't');", sep="")
> > pg.spi.exec(sql_query)
>
> I should have asked this earlier -- what version of PL/R are you using?
>

I'm using plr-0.6.2.2-alpha with PostgreSQL 8.1.8 on i686-redhat-linux-gnu, compiled by gcc 3.4.6 20060404 (powered by Red Hat 3.4.6-3)

> I fixed a bug related to protection in the spi code not too long ago.
> See the release notes (http://www.joeconway.com/plr/):
> - add missing UNPROTECT(1) in spi code -- fix for "stack imbalance"
> warning
>

Sorry, I asked about an already-fixed bug. I'm doing a new installation to test.

> Hundreds of thousands of calls to spi to do the inserts, combined
> with this bug, may be your problem.
>
> That said, doing the inserts one at a time like this is likely to be
> very inefficient. You might want to consider using COPY in some form or
> another to bulk load your data.
>

Of course, but in my case COPY is not applicable. And for each piece of information, the processing and correction take more time than the database insertion.

> Joe

Once again: thanks for your great work.

Norbert

From mail at joeconway.com Thu Aug 23 15:30:10 2007
From: mail at joeconway.com (Joe Conway)
Date: Thu, 23 Aug 2007 08:30:10 -0700
Subject: [Plr-general] protection stack overflow error
In-Reply-To: <1187880591.6135.42.camel@PC160>
References: <46B36400.1040103@joeconway.com> <60BD7AC787479D47A3D13E42D836101D024F4166@icex4.ic.ac.uk> <1187613341.6970.21.camel@PC160> <46C99AA3.4050903@joeconway.com> <1187875166.6135.23.camel@PC160> <46CD91F0.3070900@joeconway.com> <1187880591.6135.42.camel@PC160>
Message-ID: <46CDA802.1070607@joeconway.com>

Norbert Billet wrote:
> Le jeudi 23 août 2007 à
06:56 -0700, Joe Conway a écrit :
>> Norbert Billet wrote:
>>> OK, here is a simpler version of my PL/R script, but sufficient to
>>> produce the crash:
>>> my_hydrological_profile_id$nextval, ", 't');", sep="")
>>> pg.spi.exec(sql_query)
>> I should have asked this earlier -- what version of PL/R are you using?
>
> I'm using plr-0.6.2.2-alpha with PostgreSQL 8.1.8 on
> i686-redhat-linux-gnu, compiled by gcc 3.4.6 20060404 (powered by Red
> Hat 3.4.6-3)
>
>> I fixed a bug related to protection in the spi code not too long ago.
>> See the release notes (http://www.joeconway.com/plr/):
>
>> - add missing UNPROTECT(1) in spi code -- fix for "stack imbalance"
>> warning
>
> Sorry, I asked about an already-fixed bug. I'm doing a new installation to test.

OK, please let me know how it goes. Based on your version number I think this fixed bug is most probably the issue.

>> That said, doing the inserts one at a time like this is likely to be
>> very inefficient. You might want to consider using COPY in some form or
>> another to bulk load your data.
>
> Of course, but in my case COPY is not applicable. And for each piece of
> information, the processing and correction take more time than the
> database insertion.

Oh, OK. Just wanted to check.

> Once again: thanks for your great work.

You're welcome. I hope PL/R will serve you well :-)

Joe

From ssinger_pg at sympatico.ca Sat Aug 25 18:52:29 2007
From: ssinger_pg at sympatico.ca (Steve Singer)
Date: Sat, 25 Aug 2007 14:52:29 -0400 (EDT)
Subject: [Plr-general] plr patch for cursor support
Message-ID:

Attached is an initial patch to add cursor support: spi.cursor_[open,fetch,close] to plr.

This is a patch against plr-8.3.0.3-beta.tar.gz. I've done some testing against a recent pgsql head/8.3 snapshot. I had to remove the DLLIMPORT attribute from pg_backend_support.c (not included in this patch) to get it to compile.
I am also having some issues with using the cursor in backward mode, but I haven't yet determined if it's an issue with my patch or if I'd get the same behavior in a straight C/SPI function.

If there is interest in applying the patch I can send off an update for the docs.

If people have better ways of processing query results larger than memory in plr I'm open to pointers.

Steve
-------------- next part --------------
diff -u plr/pg_rsupport.c plr/pg_rsupport.c
--- plr/pg_rsupport.c 2007-06-12 17:32:21.000000000 -0400
+++ plr/pg_rsupport.c 2007-08-25 14:12:42.763014975 -0400
@@ -604,6 +607,226 @@
     return result;
 }
 
+/**
+ * Takes the prepared plan rsaved_plan and creates a cursor
+ * for it using the values specified in ragvalues.
+ *
+ */
+SEXP
+plr_SPI_cursor_open(SEXP cursor_name_arg,SEXP rsaved_plan, SEXP rargvalues) {
+saved_plan_desc *plan_desc = (saved_plan_desc *) R_ExternalPtrAddr(rsaved_plan);
+    void *saved_plan = plan_desc->saved_plan;
+    int nargs = plan_desc->nargs;
+    Oid *typelems = plan_desc->typelems;
+    FmgrInfo *typinfuncs = plan_desc->typinfuncs;
+    int i;
+    Datum *argvalues = NULL;
+    char *nulls = NULL;
+    bool isnull = false;
+    SEXP obj;
+    SEXP result = NULL;
+    MemoryContext oldcontext;
+    char cursor_name[64];
+    Portal portal=NULL;
+    PREPARE_PG_TRY;
+    /**
+     * 1. Divide rargvalues
+     */
+    /* set up error context */
+    PUSH_PLERRCONTEXT(rsupport_error_callback, "pg.spi.cursor_open");
+    if (nargs > 0)
+    {
+        if (!IS_LIST(rargvalues))
+            error("%s", "second parameter must be a list of arguments " \
+                  "to the prepared plan");
+
+        if (length(rargvalues) != nargs)
+            error("list of arguments (%d) is not the same length " \
+                  "as that of the prepared plan (%d)",
+                  length(rargvalues), nargs);
+
+        argvalues = (Datum *) palloc(nargs * sizeof(Datum));
+        nulls = (char *) palloc(nargs * sizeof(char));
+    }
+
+    for (i = 0; i < nargs; i++)
+    {
+        PROTECT(obj = VECTOR_ELT(rargvalues, i));
+
+        argvalues[i] = get_scalar_datum(obj, typinfuncs[i], typelems[i], &isnull);
+        if (!isnull)
+            nulls[i] = ' ';
+        else
+            nulls[i] = 'n';
+
+        UNPROTECT(1);
+    }
+    strncpy(cursor_name,CHAR(STRING_ELT(cursor_name_arg,0)),64);
+
+    /* switch to SPI memory context */
+    oldcontext = MemoryContextSwitchTo(plr_SPI_context);
+
+    /*
+     * trap elog/ereport so we can let R finish up gracefully
+     * and generate the error once we exit the interpreter
+     */
+    PG_TRY();
+    {
+        /* Open the cursor */
+        portal = SPI_cursor_open(cursor_name,saved_plan, argvalues, nulls,1);
+
+    }
+    PLR_PG_CATCH();
+    PLR_PG_END_TRY();
+
+    /* back to caller's memory context */
+    MemoryContextSwitchTo(oldcontext);
+    if(portal==NULL)
+    {
+        /*Error*/
+        error("SPI_cursor_open() failed: ");
+    }
+    else
+    {
+        result = R_MakeExternalPtr(portal, R_NilValue, R_NilValue);
+    }
+    POP_PLERRCONTEXT;
+    return result;
+}
+
+
+SEXP plr_SPI_cursor_fetch(SEXP cursor_in,SEXP forward_in, SEXP rows_in)
+{
+
+    Portal portal=NULL;
+    int ntuples;
+    SEXP result = NULL;
+    MemoryContext oldcontext;
+    int forward;
+    int rows;
+    PREPARE_PG_TRY;
+    PUSH_PLERRCONTEXT(rsupport_error_callback, "pg.spi.cursor_fetch");
+
+
+
+    portal = R_ExternalPtrAddr(cursor_in);
+    if(!IS_LOGICAL(forward_in))
+    {
+        error("pg.spi.cursor_fetch arg2 must be boolean");
+        return result;
+    }
+    if(!IS_INTEGER(rows_in))
+    {
+        error("pg.spi.cursor_fetch arg3 must be an integer");
+        return result;
+    }
+    forward = LOGICAL_DATA(forward_in)[0];
+    rows = INTEGER_DATA(rows_in)[0];
+
+
+    /* switch to SPI memory context */
+    oldcontext = MemoryContextSwitchTo(plr_SPI_context);
+    PG_TRY();
+    {
+        /* Open the cursor */
+        SPI_cursor_fetch(portal,forward,rows);
+
+    }
+    PLR_PG_CATCH();
+    PLR_PG_END_TRY();
+    /* back to caller's memory context */
+    MemoryContextSwitchTo(oldcontext);
+
+    /* check the result */
+
+    ntuples = SPI_processed;
+    if (ntuples > 0)
+    {
+        result = rpgsql_get_results(ntuples, SPI_tuptable);
+        SPI_freetuptable(SPI_tuptable);
+    }
+    else
+        result = R_NilValue;
+
+
+
+    POP_PLERRCONTEXT;
+    return result;
+
+}
+
+void plr_SPI_cursor_close(SEXP cursor_in)
+{
+    Portal portal=NULL;
+    MemoryContext oldcontext;
+    PREPARE_PG_TRY;
+    PUSH_PLERRCONTEXT(rsupport_error_callback, "pg.spi.cursor_close");
+
+
+    portal = R_ExternalPtrAddr(cursor_in);
+
+    /* switch to SPI memory context */
+    oldcontext = MemoryContextSwitchTo(plr_SPI_context);
+    PG_TRY();
+    {
+        /* Open the cursor */
+        SPI_cursor_close(portal);
+
+    }
+    PLR_PG_CATCH();
+    PLR_PG_END_TRY();
+    /* back to caller's memory context */
+    MemoryContextSwitchTo(oldcontext);
+
+
+
+}
+
+
+
+void plr_SPI_cursor_move(SEXP cursor_in,SEXP forward_in, SEXP rows_in)
+{
+
+    Portal portal=NULL;
+    MemoryContext oldcontext;
+    int forward;
+    int rows;
+    PREPARE_PG_TRY;
+    PUSH_PLERRCONTEXT(rsupport_error_callback, "pg.spi.cursor_move");
+
+
+
+    portal = R_ExternalPtrAddr(cursor_in);
+    if(!IS_LOGICAL(forward_in))
+    {
+        error("pg.spi.cursor_move arg2 must be boolean");
+        return;
+    }
+    if(!IS_INTEGER(rows_in))
+    {
+        error("pg.spi.cursor_move arg3 must be an integer");
+        return;
+    }
+    forward = LOGICAL(forward_in)[0];
+    rows = INTEGER(rows_in)[0];
+
+
+    /* switch to SPI memory context */
+    oldcontext = MemoryContextSwitchTo(plr_SPI_context);
+    PG_TRY();
+    {
+        /* Open the cursor */
+        SPI_cursor_move(portal,forward,rows);
+
+    }
+    PLR_PG_CATCH();
+    PLR_PG_END_TRY();
+    /* back to caller's memory context */
+    MemoryContextSwitchTo(oldcontext);
+
+
+}
+
 void
 throw_r_error(const char **msg)
 {
diff -u plr/plr.c plr/plr.c
--- plr/plr.c 2007-06-24 10:51:30.000000000 -0400
+++ plr/plr.c 2007-08-25 00:40:17.898503802 -0400
@@ -90,6 +90,18 @@
 #define SPI_EXECP_CMD \
             "pg.spi.execp <-function(sql, argvalues = NA) " \
             "{.Call(\"plr_SPI_execp\", sql, argvalues)}"
+#define SPI_CURSOR_OPEN_CMD \
+            "pg.spi.cursor_open<-function(cursor_name,plan,argvalues=NA) " \
+            "{.Call(\"plr_SPI_cursor_open\",cursor_name,plan,argvalues)}"
+#define SPI_CURSOR_FETCH_CMD \
+            "pg.spi.cursor_fetch<-function(cursor,forward,rows) " \
+            "{.Call(\"plr_SPI_cursor_fetch\",cursor,forward,rows)}"
+#define SPI_CURSOR_MOVE_CMD \
+            "pg.spi.cursor_move<-function(cursor,forward,rows) " \
+            "{.Call(\"plr_SPI_cursor_move\",cursor,forward,rows)}"
+#define SPI_CURSOR_CLOSE_CMD \
+            "pg.spi.cursor_close<-function(cursor) " \
+            "{.Call(\"plr_SPI_cursor_close\",cursor)}"
 #define SPI_LASTOID_CMD \
             "pg.spi.lastoid <-function() " \
             "{.Call(\"plr_SPI_lastoid\")}"
@@ -356,6 +368,10 @@
         SPI_EXEC_CMD,
         SPI_PREPARE_CMD,
         SPI_EXECP_CMD,
+        SPI_CURSOR_OPEN_CMD,
+        SPI_CURSOR_FETCH_CMD,
+        SPI_CURSOR_MOVE_CMD,
+        SPI_CURSOR_CLOSE_CMD,
         SPI_LASTOID_CMD,
         SPI_FACTOR_CMD,
 
diff -u plr/plr.h plr/plr.h
--- plr/plr.h 2007-06-12 17:32:21.000000000 -0400
+++ plr/plr.h 2007-08-25 14:12:32.549014975 -0400
@@ -449,6 +449,11 @@
 extern SEXP plr_SPI_exec(SEXP rsql);
 extern SEXP plr_SPI_prepare(SEXP rsql, SEXP rargtypes);
 extern SEXP plr_SPI_execp(SEXP rsaved_plan, SEXP rargvalues);
+extern SEXP plr_SPI_cursor_open(SEXP cursor_name_arg,SEXP rsaved_plan, SEXP rargvalues);
+extern SEXP plr_SPI_cursor_fetch(SEXP cursor_in,SEXP forward_in, SEXP rows_in);
+extern void plr_SPI_cursor_close(SEXP cursor_in);
+extern void plr_SPI_cursor_move(SEXP cursor_in, SEXP forward_in, SEXP rows_in);
+
 extern SEXP plr_SPI_lastoid(void);
 extern void throw_r_error(const char **msg);
 

From mail at joeconway.com Sun Aug 26 03:40:39 2007
From: mail at joeconway.com (Joe
Conway)
Date: Sat, 25 Aug 2007 20:40:39 -0700
Subject: [Plr-general] plr patch for cursor support
In-Reply-To:
References:
Message-ID: <46D0F637.60106@joeconway.com>

Steve Singer wrote:
>
> Attached is an initial patch to add cursor support:
> spi.cursor_[open,fetch,close]
> to plr.
>
> This is a patch against plr-8.3.0.3-beta.tar.gz. I've done some
> testing against a recent pgsql head/8.3 snapshot. I had to remove the
> DLLIMPORT attribute from pg_backend_support.c (not included in this
> patch) to get it to compile.

Thanks. I'll have a closer look, but probably not for a week or two.

Joe

From ssinger_pg at sympatico.ca Sun Aug 26 17:45:56 2007
From: ssinger_pg at sympatico.ca (Steve Singer)
Date: Sun, 26 Aug 2007 13:45:56 -0400 (EDT)
Subject: [Plr-general] memory leak in POP_PLRCONTEXT
Message-ID:

Joe,

When you do get around to looking at plr patches, here is another one.

PUSH_PLRCONTEXT makes a call to pstrdup but POP_PLRCONTEXT doesn't have a corresponding call to pfree. If the memory context is long-lived (i.e. if your plr function makes lots of spi calls), memory usage can start to grow.

The attached patch adds a call to pfree() in POP_PLRCONTEXT.
-------------- next part --------------
--- ../plr_patched/plr.h 2007-08-25 14:37:21.878217889 -0400
+++ plr.h 2007-08-26 12:52:02.959541048 -0400
@@ -218,6 +218,7 @@
 #define POP_PLERRCONTEXT \
     do { \
         error_context_stack = plerrcontext.previous; \
+        pfree(plerrcontext.arg); \
     } while (0)
 
 #define SAVE_PLERRCONTEXT \

From nconway at truviso.com Wed Aug 29 22:36:03 2007
From: nconway at truviso.com (Neil Conway)
Date: Wed, 29 Aug 2007 15:36:03 -0700
Subject: [Plr-general] Patches: version check, resetStringInfo
Message-ID: <1188426963.32003.43.camel@dell.linuxdev.us.dell.com>

Attached are three trivial patches:

(1) The first two patches change PL/R to use PG_VERSION_NUM to get the version of the Postgres headers we're compiling against, instead of CATALOG_VERSION_NO.
At Truviso, the catversion check prevents PL/R from compiling out of the box: we ship what is essentially PostgreSQL 8.2.4 with additional functionality and a different catversion. Since PG_VERSION_NUM is available in 8.2+, it should be safe to use that instead of the catversion for the PG 8.2 and 8.3devel versions of PL/R.

(2) The last patch removes resetStringInfo(), which is unused. CVS HEAD contains a different implementation of resetStringInfo() anyway.

-Neil, wearing his Truviso hat
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plr_82_version_check.patch
Type: text/x-patch
Size: 578 bytes
Desc: not available
Url : http://pgfoundry.org/pipermail/plr-general/attachments/20070829/9f469feb/attachment.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plr_83_version_check.patch
Type: text/x-patch
Size: 576 bytes
Desc: not available
Url : http://pgfoundry.org/pipermail/plr-general/attachments/20070829/9f469feb/attachment-0001.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: plr_83_reset_string_info.patch
Type: text/x-patch
Size: 347 bytes
Desc: not available
Url : http://pgfoundry.org/pipermail/plr-general/attachments/20070829/9f469feb/attachment-0002.bin

From josh at gghc.com Fri Aug 31 20:10:00 2007
From: josh at gghc.com (Joshua Reich)
Date: Fri, 31 Aug 2007 16:10:00 -0400
Subject: [Plr-general] lapack routines cannot be loaded in PL/R, but work in R
Message-ID:

Hi,

The following code snippet is taken from help(chol2inv) and wrapped in PL/R:

CREATE OR REPLACE FUNCTION alapacka () RETURNS float AS '
cma <- chol(ma <- cbind(1, 1:3, c(1,3,7)))
ma %*% chol2inv(cma)
1
' LANGUAGE 'plr';

When I run it:

skunk=# select * from alapacka();
ERROR: R interpreter expression evaluation error
DETAIL: Error in chol(ma <- cbind(1, 1:3, c(1, 3, 7))) :
lapack routines cannot be loaded
CONTEXT: In PL/R function alapacka

However, in an R session on the same box, everything works as expected:

> cma <- chol(ma <- cbind(1, 1:3, c(1,3,7)))
> ma %*% chol2inv(cma)
              [,1]         [,2] [,3]
[1,]  1.000000e+00 0.000000e+00    0
[2,]  1.110223e-16 1.000000e+00    0
[3,] -1.110223e-16 4.440892e-16    1

My machine is running Debian, with 2.6.18-4 linux and R comes from the r-base package:

R version 2.4.0 Patched (2006-11-25 r39997)

Any & all hints as to how to resolve this would be greatly appreciated.
Thanks,

Josh

From josh at gghc.com Fri Aug 31 20:44:53 2007
From: josh at gghc.com (Joshua Reich)
Date: Fri, 31 Aug 2007 16:44:53 -0400
Subject: [Plr-general] lapack routines cannot be loaded in PL/R, but work in R
In-Reply-To:
References:
Message-ID:

To answer my own question, /usr/lib/R was missing from my /etc/ld.so.conf

Josh

-----Original Message-----
From: Joshua Reich
Sent: Friday, August 31, 2007 4:10 PM
To: 'plr-general at pgfoundry.org'
Subject: lapack routines cannot be loaded in PL/R, but work in R

Hi,

The following code snippet is taken from help(chol2inv) and wrapped in PL/R:

CREATE OR REPLACE FUNCTION alapacka () RETURNS float AS '
cma <- chol(ma <- cbind(1, 1:3, c(1,3,7)))
ma %*% chol2inv(cma)
1
' LANGUAGE 'plr';

When I run it:

skunk=# select * from alapacka();
ERROR: R interpreter expression evaluation error
DETAIL: Error in chol(ma <- cbind(1, 1:3, c(1, 3, 7))) :
lapack routines cannot be loaded
CONTEXT: In PL/R function alapacka

However, in an R session on the same box, everything works as expected:

> cma <- chol(ma <- cbind(1, 1:3, c(1,3,7)))
> ma %*% chol2inv(cma)
              [,1]         [,2] [,3]
[1,]  1.000000e+00 0.000000e+00    0
[2,]  1.110223e-16 1.000000e+00    0
[3,] -1.110223e-16 4.440892e-16    1

My machine is running Debian, with 2.6.18-4 linux and R comes from the r-base package:

R version 2.4.0 Patched (2006-11-25 r39997)

Any & all hints as to how to resolve this would be greatly appreciated.

Thanks,

Josh
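
[Editorial note] The fix reported in the reply above (/usr/lib/R missing from /etc/ld.so.conf) can be checked with a few lines of shell. What follows is only a sketch under stated assumptions: the helper name has_ld_path is made up for illustration, /usr/lib/R is the path from the message and may differ on other systems, and the check reads only the named file, so it will not notice directories pulled in through an include line such as /etc/ld.so.conf.d/*.conf.

```shell
#!/bin/sh
# Sketch only: report whether a library directory is listed in an
# ld.so.conf-style file. has_ld_path is a hypothetical helper name;
# /usr/lib/R is the directory mentioned in the message above.

has_ld_path() {
    dir=$1
    conf=${2:-/etc/ld.so.conf}
    # -x matches whole lines only, -F treats the directory as a fixed string
    grep -qxF -- "$dir" "$conf" 2>/dev/null
}

if has_ld_path /usr/lib/R; then
    echo "/usr/lib/R is listed in /etc/ld.so.conf"
else
    echo "/usr/lib/R is not listed in /etc/ld.so.conf"
    echo "as root, one way to add it and refresh the linker cache:"
    echo "  echo /usr/lib/R >> /etc/ld.so.conf && ldconfig"
fi
```

The script deliberately only prints the suggested commands instead of running them, since editing /etc/ld.so.conf needs root and the right directory depends on how R was installed.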