'Re: [HACKERS] proposal: row_to_array function'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       pgsql-hackers
Subject:    Re: [HACKERS] proposal: row_to_array function
From:       Pavel Stehule <pavel.stehule () gmail ! com>
Date:       2015-03-31 12:50:17
Message-ID: CAFj8pRD7ybQ=yrWhDsUyaYH4C1ycHPoyB69bK-5s_93UEZvZ+Q () mail ! gmail ! com
[Download RAW message or body]

[Attachment #2 (multipart/alternative)]


2015-03-29 21:20 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

>
>
> 2015-03-29 20:27 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
>
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> > here is rebased patch.
>> > It contains both patches - row_to_array function and foreach array
>> support.
>>
>> While I don't have a problem with hstore_to_array, I don't think that
>> row_to_array is a very good idea; it's basically encouraging people to
>> throw away SQL datatypes altogether and imagine that everything is text.
>>
>
> This is complementation of ARRAY API - we have row_to_json, probably will
> have row_to_jsonb, row_to_hstore and "row_to_array" is relative logical.
> Casting to text is not fast, but on second hand - working with text arrays
> is fast.
>
> I know so casting to text is a problem, but if you iterate over record's
> fields, then you have to find common shared type due sharing plans - and
> text arrays can be simple solution.
>
> Now, with current possibilities I'll do full sql expression SELECT key,
> value FROM each(hstore(ROW)) or FOREACH ARRAY hstore_to_matrix(hstore(ROW))
>
> row_to_array(ROW) can reduce a hstore overhead
>
> any other solution based on PL/Perl or PL/Python are slower due PL engine
> start and due same transformation to some form of structured text.
>
>
>
>
>> They've already bought into that concept if they are using hstore or
>> json, so smashing elements of those containers to text is not a problem.
>> But that doesn't make this version a good thing.
>>
>> (In any case, those who insist can get there through row_to_json, no?)
>>
>> Also, could we please *not* mix up these two very independent features?
>> "foreach array" as implemented here may or may not be a good thing, but
>> it should get its own discussion.
>>
>
> ok, I'll send two patches.
>

attachments contains previous patch separated to two independent patches.

Regards

Pavel



>
>
>>
>>                         regards, tom lane
>>
>
>

[Attachment #5 (text/html)]

<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">2015-03-29 \
21:20 GMT+02:00 Pavel Stehule <span dir="ltr">&lt;<a \
href="mailto:pavel.stehule@gmail.com" \
target="_blank">pavel.stehule@gmail.com</a>&gt;</span>:<br><blockquote \
class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc \
solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div \
class="gmail_quote"><span class="">2015-03-29 20:27 GMT+02:00 Tom Lane <span \
dir="ltr">&lt;<a href="mailto:tgl@sss.pgh.pa.us" \
target="_blank">tgl@sss.pgh.pa.us</a>&gt;</span>:<br><blockquote class="gmail_quote" \
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>Pavel \
Stehule &lt;<a href="mailto:pavel.stehule@gmail.com" \
target="_blank">pavel.stehule@gmail.com</a>&gt; writes:<br> &gt; here is rebased \
patch.<br> &gt; It contains both patches - row_to_array function and foreach array \
support.<br> <br>
</span>While I don&#39;t have a problem with hstore_to_array, I don&#39;t think \
that<br> row_to_array is a very good idea; it&#39;s basically encouraging people \
to<br> throw away SQL datatypes altogether and imagine that everything is \
text.<br></blockquote><div><br></div></span><div>This is complementation of ARRAY API \
- we have row_to_json, probably will have row_to_jsonb, row_to_hstore and \
&quot;row_to_array&quot; is relative logical.   Casting to text is not fast, but on \
second hand - working with text arrays is fast.<br><br></div><div>I know so casting \
to text is a problem, but if you iterate over record&#39;s fields, then you have to \
find common shared type due sharing plans - and text arrays can be simple solution. \
<br><br></div><div>Now, with current possibilities I&#39;ll do full sql expression \
SELECT key, value FROM each(hstore(ROW)) or FOREACH ARRAY \
hstore_to_matrix(hstore(ROW))<br><br></div><div>row_to_array(ROW) can reduce a hstore \
overhead <br><br></div><div>any other solution based on PL/Perl or PL/Python are \
slower due PL engine start and due same transformation to some form of structured \
text. <br></div><span class=""><div><br><br></div><div>  </div><blockquote \
class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc \
solid;padding-left:1ex"> They&#39;ve already bought into that concept if they are \
using hstore or<br> json, so smashing elements of those containers to text is not a \
problem.<br> But that doesn&#39;t make this version a good thing.<br>
<br>
(In any case, those who insist can get there through row_to_json, no?)<br>
<br>
Also, could we please *not* mix up these two very independent features?<br>
&quot;foreach array&quot; as implemented here may or may not be a good thing, but<br>
it should get its own discussion.<br></blockquote><div><br></div></span><div>ok, \
I&#39;ll send two patches.<br></div></div></div></div></blockquote><div><br></div><div>attachments \
contains previous patch separated to two independent \
patches.<br><br></div><div>Regards<br><br></div><div>Pavel<br></div><div><br>  \
<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px \
#ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div \
class="gmail_quote"><div></div><div>  </div><blockquote class="gmail_quote" \
style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <br>
                                    regards, tom lane<br>
</blockquote></div><br></div></div>
</blockquote></div><br></div></div>


["row_to_array-20150331-01.patch" (text/x-patch)]

commit 0b432fd3a42132ddddd287c4395b13f8a25ab294
Author: Pavel Stehule <pavel.stehule@gooddata.com>
Date:   Tue Mar 31 14:43:27 2015 +0200

    row_to_array

diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c
index a65e18d..1a64d8e 100644
--- a/src/backend/utils/adt/rowtypes.c
+++ b/src/backend/utils/adt/rowtypes.c
@@ -21,6 +21,7 @@
 #include "catalog/pg_type.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
+#include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
 #include "utils/typcache.h"
@@ -1810,3 +1811,90 @@ btrecordimagecmp(PG_FUNCTION_ARGS)
 {
 	PG_RETURN_INT32(record_image_cmp(fcinfo));
 }
+
+/*
+ * transform any record to array in format [key1, value1, key2, value2 [, ...]]
+ *
+ * This format is compatible with hstore_to_array function
+ */
+Datum
+row_to_array(PG_FUNCTION_ARGS)
+{
+	HeapTupleHeader		rec = PG_GETARG_HEAPTUPLEHEADER(0);
+	TupleDesc		rectupdesc;
+	Oid			rectuptyp;
+	int32			rectuptypmod;
+	HeapTupleData		rectuple;
+	int	ncolumns;
+	Datum 		*recvalues;
+	bool  		*recnulls;
+	ArrayBuildState		*builder;
+	int	i;
+
+	/* Extract type info from the tuple itself */
+	rectuptyp = HeapTupleHeaderGetTypeId(rec);
+	rectuptypmod = HeapTupleHeaderGetTypMod(rec);
+	rectupdesc = lookup_rowtype_tupdesc(rectuptyp, rectuptypmod);
+	ncolumns = rectupdesc->natts;
+
+	/* Build a temporary HeapTuple control structure */
+	rectuple.t_len = HeapTupleHeaderGetDatumLength(rec);
+	ItemPointerSetInvalid(&(rectuple.t_self));
+	rectuple.t_tableOid = InvalidOid;
+	rectuple.t_data = rec;
+
+	recvalues = (Datum *) palloc(ncolumns * sizeof(Datum));
+	recnulls = (bool *) palloc(ncolumns * sizeof(bool));
+
+	/* Break down the tuple into fields */
+	heap_deform_tuple(&rectuple, rectupdesc, recvalues, recnulls);
+
+	/* Prepare target array */
+	builder = initArrayResult(TEXTOID, CurrentMemoryContext, true);
+
+	for (i = 0; i < ncolumns; i++)
+	{
+		Oid	columntyp = rectupdesc->attrs[i]->atttypid;
+		Datum		value;
+		bool		isnull;
+
+		/* Ignore dropped columns */
+		if (rectupdesc->attrs[i]->attisdropped)
+			continue;
+
+		builder = accumArrayResult(builder,
+							CStringGetTextDatum(NameStr(rectupdesc->attrs[i]->attname)),
+							false,
+							TEXTOID,
+							CurrentMemoryContext);
+
+		if (!recnulls[i])
+		{
+			char *outstr;
+			bool		typIsVarlena;
+			Oid		typoutput;
+			FmgrInfo		proc;
+
+			getTypeOutputInfo(columntyp, &typoutput, &typIsVarlena);
+			fmgr_info_cxt(typoutput, &proc, CurrentMemoryContext);
+			outstr = OutputFunctionCall(&proc, recvalues[i]);
+
+			value = CStringGetTextDatum(outstr);
+			isnull = false;
+		}
+		else
+		{
+			value = (Datum) 0;
+			isnull = true;
+		}
+
+		builder = accumArrayResult(builder,
+						    value, isnull,
+						    TEXTOID,
+						    CurrentMemoryContext);
+	}
+
+	ReleaseTupleDesc(rectupdesc);
+
+	PG_RETURN_DATUM(makeArrayResult(builder, CurrentMemoryContext));
+}
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index a96d369..1b4c578 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -891,6 +891,8 @@ DATA(insert OID = 376 (  string_to_array   PGNSP PGUID 12 1 0 0 0 \
f f f f f f i  DESCR("split delimited text into text[], with null string");
 DATA(insert OID = 384 (  array_to_string   PGNSP PGUID 12 1 0 0 0 f f f f f f s 3 0 \
25 "2277 25 25" _null_ _null_ _null_ _null_ array_to_text_null _null_ _null_ _null_ \
));  DESCR("concatenate array elements, using delimiter and null string, into text");
+DATA(insert OID = 4057 (  row_to_array   PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 \
1009 "2249" _null_ _null_ _null_ _null_ row_to_array _null_ _null_ _null_ )); \
+DESCR("transform any record to text[]");  DATA(insert OID = 515 (  array_larger	   \
PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 2277 "2277 2277" _null_ _null_ _null_ _null_ \
array_larger _null_ _null_ _null_ ));  DESCR("larger of two");
 DATA(insert OID = 516 (  array_smaller	   PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 \
                2277 "2277 2277" _null_ _null_ _null_ _null_ array_smaller _null_ \
                _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 6310641..7aabfe1 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -668,6 +668,7 @@ extern Datum record_image_gt(PG_FUNCTION_ARGS);
 extern Datum record_image_le(PG_FUNCTION_ARGS);
 extern Datum record_image_ge(PG_FUNCTION_ARGS);
 extern Datum btrecordimagecmp(PG_FUNCTION_ARGS);
+extern Datum row_to_array(PG_FUNCTION_ARGS);
 
 /* ruleutils.c */
 extern bool quote_all_identifiers;
diff --git a/src/test/regress/expected/rowtypes.out \
b/src/test/regress/expected/rowtypes.out index 54525de..0e249be 100644
--- a/src/test/regress/expected/rowtypes.out
+++ b/src/test/regress/expected/rowtypes.out
@@ -634,3 +634,11 @@ select row_to_json(r) from (select q2,q1 from tt1 offset 0) r;
  {"q2":0,"q1":0}
 (3 rows)
 
+select row_to_array(r) from (select q2,q1 from tt1 offset 0) r;
+         row_to_array         
+------------------------------
+ {q2,456,q1,123}
+ {q2,4567890123456789,q1,123}
+ {q2,0,q1,0}
+(3 rows)
+
diff --git a/src/test/regress/sql/rowtypes.sql b/src/test/regress/sql/rowtypes.sql
index bc3f021..3450417 100644
--- a/src/test/regress/sql/rowtypes.sql
+++ b/src/test/regress/sql/rowtypes.sql
@@ -271,3 +271,4 @@ create temp table tt1 as select * from int8_tbl limit 2;
 create temp table tt2 () inherits(tt1);
 insert into tt2 values(0,0);
 select row_to_json(r) from (select q2,q1 from tt1 offset 0) r;
+select row_to_array(r) from (select q2,q1 from tt1 offset 0) r;


["plpgsql-multiassign-foreach-20150331-01.patch" (text/x-patch)]

commit bd8ef3d652f3d4bc44012c6db3a018e2d14cd85f
Author: Pavel Stehule <pavel.stehule@gooddata.com>
Date:   Tue Mar 31 14:46:54 2015 +0200

    plpgsql - multiassing foreach

diff --git a/contrib/hstore/expected/hstore.out b/contrib/hstore/expected/hstore.out
index 9749e45..e44532e 100644
--- a/contrib/hstore/expected/hstore.out
+++ b/contrib/hstore/expected/hstore.out
@@ -1148,6 +1148,22 @@ select %% 'aa=>1, cq=>l, b=>g, fg=>NULL';
  {b,g,aa,1,cq,l,fg,NULL}
 (1 row)
 
+-- fast iteration over keys
+do $$
+declare
+  key text;
+  value text;
+begin
+  foreach key, value in array hstore_to_array('aa=>1, cq=>l, b=>g, fg=>NULL'::hstore)
+  loop
+    raise notice 'key: %, value: %', key, value;
+  end loop;
+end;
+$$;
+NOTICE:  key: b, value: g
+NOTICE:  key: aa, value: 1
+NOTICE:  key: cq, value: l
+NOTICE:  key: fg, value: <NULL>
 select hstore_to_matrix('aa=>1, cq=>l, b=>g, fg=>NULL'::hstore);
         hstore_to_matrix         
 ---------------------------------
diff --git a/contrib/hstore/sql/hstore.sql b/contrib/hstore/sql/hstore.sql
index 5a9e9ee..7b9eb09 100644
--- a/contrib/hstore/sql/hstore.sql
+++ b/contrib/hstore/sql/hstore.sql
@@ -257,6 +257,19 @@ select avals('');
 select hstore_to_array('aa=>1, cq=>l, b=>g, fg=>NULL'::hstore);
 select %% 'aa=>1, cq=>l, b=>g, fg=>NULL';
 
+-- fast iteration over keys
+do $$
+declare
+  key text;
+  value text;
+begin
+  foreach key, value in array hstore_to_array('aa=>1, cq=>l, b=>g, fg=>NULL'::hstore)
+  loop
+    raise notice 'key: %, value: %', key, value;
+  end loop;
+end;
+$$;
+
 select hstore_to_matrix('aa=>1, cq=>l, b=>g, fg=>NULL'::hstore);
 select %# 'aa=>1, cq=>l, b=>g, fg=>NULL';
 
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index d36acf6..e4abb97 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -2505,6 +2505,29 @@ NOTICE:  row = {7,8,9}
 NOTICE:  row = {10,11,12}
 </programlisting>
     </para>
+
+    <para>
+     <literal>FOREACH</> cycle can be used for iteration over record. You
+     need a <xref linkend="hstore"> extension. For this case a clause
+     <literal>SLICE</literal> should not be used. <literal>FOREACH</literal>
+     statements supports list of target variables. When source array is
+     a array of composites, then composite array element is saved to target
+     variables. When the array is a array of scalar values, then target 
+     variables are filled item by item.
+<programlisting>
+CREATE FUNCTION trig_function() RETURNS TRIGGER AS $$
+DECLARE
+  key text; value text;
+BEGIN
+  FOREACH key, value IN ARRAY hstore_to_array(hstore(NEW))
+  LOOP
+    RAISE NOTICE 'key = %, value = %', key, value;
+  END LOOP;
+  RETURN NEW;
+END;
+$$ LANGUAGE plpgsql;
+</programlisting>
+    </para>
    </sect2>
 
    <sect2 id="plpgsql-error-trapping">
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index deefb1f..5c34a03 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -2261,6 +2261,9 @@ exec_stmt_foreach_a(PLpgSQL_execstate *estate, PLpgSQL_stmt_foreach_a *stmt)
 	Datum		value;
 	bool		isnull;
 
+
+	bool		multiassign = false;
+
 	/* get the value of the array expression */
 	value = exec_eval_expr(estate, stmt->expr, &isnull, &arrtype, &arrtypmod);
 	if (isnull)
@@ -2322,6 +2325,21 @@ exec_stmt_foreach_a(PLpgSQL_execstate *estate, PLpgSQL_stmt_foreach_a *stmt)
 				(errcode(ERRCODE_DATATYPE_MISMATCH),
 			  errmsg("FOREACH loop variable must not be of an array type")));
 
+	/*
+	 * it is multiassign? Don't support slicing yet.
+	 */
+	if (loop_var->dtype == PLPGSQL_DTYPE_ROW
+		 && !type_is_rowtype(ARR_ELEMTYPE(arr)))
+	{
+		if (stmt->slice != 0)
+			ereport(ERROR,
+					(errcode(ERRCODE_DATATYPE_MISMATCH),
+				  errmsg("cannot to assign non composite value to composite variable")));
+
+		/* only when target var is composite, SLICE=0 and source is scalar */
+		multiassign = true;
+	}
+
 	/* Create an iterator to step through the array */
 	array_iterator = array_create_iterator(arr, stmt->slice, NULL);
 
@@ -2344,13 +2362,45 @@ exec_stmt_foreach_a(PLpgSQL_execstate *estate, PLpgSQL_stmt_foreach_a *stmt)
 	{
 		found = true;			/* looped at least once */
 
-		/* Assign current element/slice to the loop variable */
-		exec_assign_value(estate, loop_var, value, isnull,
-						  iterator_result_type, iterator_result_typmod);
+		if (!multiassign)
+		{
+			/* Assign current element/slice to the loop variable */
+			exec_assign_value(estate, loop_var, value, isnull,
+							  iterator_result_type, iterator_result_typmod);
+
+			/* In slice case, value is temporary; must free it to avoid leakage */
+			if (stmt->slice > 0)
+				pfree(DatumGetPointer(value));
+		}
+		else
+		{
+			int	i;
+			bool	first = true;
+			PLpgSQL_row *row = (PLpgSQL_row *) loop_var;
+
+			for (i = 0; i < row->nfields; i++)
+			{
+				int		varno = row->varnos[i];
 
-		/* In slice case, value is temporary; must free it to avoid leakage */
-		if (stmt->slice > 0)
-			pfree(DatumGetPointer(value));
+				if (varno != -1)
+				{
+					PLpgSQL_datum *var = (PLpgSQL_datum *) (estate->datums[varno]);
+
+					if (!first)
+					{
+						if (!array_iterate(array_iterator, &value, &isnull))
+							ereport(ERROR,
+									(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+								   errmsg("array is not well sized, missing data")));
+					}
+					else
+						first = false;
+
+					exec_assign_value(estate, var, value, isnull,
+								  iterator_result_type, iterator_result_typmod);
+				}
+			}
+		}
 
 		/*
 		 * Execute the statements
diff --git a/src/test/regress/expected/plpgsql.out b/src/test/regress/expected/plpgsql.out
index 78e5a85..92d448d 100644
--- a/src/test/regress/expected/plpgsql.out
+++ b/src/test/regress/expected/plpgsql.out
@@ -5127,6 +5127,59 @@ NOTICE:  {"(35,78)","(88,76)"}
 
 drop function foreach_test(anyarray);
 drop type xy_tuple;
+-- multiassign (key,value) tests
+create or replace function foreach_test_ab(anyarray)
+returns void as $$
+declare
+  a text; b text;
+begin
+  foreach a,b in array $1
+  loop
+    raise notice 'a: %, b: %', a, b;
+  end loop;
+end
+$$ language plpgsql;
+select foreach_test_ab(array[1,2,3,4]);
+NOTICE:  a: 1, b: 2
+NOTICE:  a: 3, b: 4
+ foreach_test_ab 
+-----------------
+ 
+(1 row)
+
+select foreach_test_ab(array[[1,2],[3,4]]);
+NOTICE:  a: 1, b: 2
+NOTICE:  a: 3, b: 4
+ foreach_test_ab 
+-----------------
+ 
+(1 row)
+
+select foreach_test_ab(array[[1,2,3]]);
+NOTICE:  a: 1, b: 2
+ERROR:  array is not well sized, missing data
+CONTEXT:  PL/pgSQL function foreach_test_ab(anyarray) line 5 at FOREACH over array
+select foreach_test_ab('{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[]);
+NOTICE:  a: 1, b: 2
+NOTICE:  a: 3, b: 4
+NOTICE:  a: 5, b: 6
+NOTICE:  a: 7, b: 8
+ foreach_test_ab 
+-----------------
+ 
+(1 row)
+
+select foreach_test_ab(array[null,null, 1,null, 1,1, null,1]);
+NOTICE:  a: <NULL>, b: <NULL>
+NOTICE:  a: 1, b: <NULL>
+NOTICE:  a: 1, b: 1
+NOTICE:  a: <NULL>, b: 1
+ foreach_test_ab 
+-----------------
+ 
+(1 row)
+
+drop function foreach_test_ab(anyarray);
 --
 -- Assorted tests for array subscript assignment
 --
diff --git a/src/test/regress/sql/plpgsql.sql b/src/test/regress/sql/plpgsql.sql
index e19e415..7640f5d 100644
--- a/src/test/regress/sql/plpgsql.sql
+++ b/src/test/regress/sql/plpgsql.sql
@@ -4075,6 +4075,27 @@ select foreach_test(ARRAY[[(10,20),(40,69)],[(35,78),(88,76)]]::xy_tuple[]);
 drop function foreach_test(anyarray);
 drop type xy_tuple;
 
+-- multiassign (key,value) tests
+create or replace function foreach_test_ab(anyarray)
+returns void as $$
+declare
+  a text; b text;
+begin
+  foreach a,b in array $1
+  loop
+    raise notice 'a: %, b: %', a, b;
+  end loop;
+end
+$$ language plpgsql;
+
+select foreach_test_ab(array[1,2,3,4]);
+select foreach_test_ab(array[[1,2],[3,4]]);
+select foreach_test_ab(array[[1,2,3]]);
+select foreach_test_ab('{{{1,2},{3,4}},{{5,6},{7,8}}}'::int[]);
+select foreach_test_ab(array[null,null, 1,null, 1,1, null,1]);
+
+drop function foreach_test_ab(anyarray);
+
 --
 -- Assorted tests for array subscript assignment
 --


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic