Prevent automatic oid assignment when in binary upgrade mode. Also
throw an error when contrib/pg_upgrade_support functions are called when
not in binary upgrade mode.
This prevent automatically-assigned oids from conflicting with later
pre-assigned oids coming from the old cluster. It also makes sure oids
are preserved in call important cases.
Some of the many error messages introduced in 458857cc missed 'FROM
unpackaged'. Also e016b724 and 45ffeb7e forgot to quote extension
version numbers.
Backpatch to 9.1, just like 458857cc which introduced the messages. Do
so because the error messages thrown when the wrong command is copy &
pasted aren't easy to understand.
We also don't track PREPARE, nor do we track planning time in general, so
let's ignore DEALLOCATE as well for consistency.
Backpatch to 9.4, but not further than that. Although it seems unlikely that
anyone is relying on the current behavior, this is a behavioral change.
Fabien Coelho
The new column shows how many backends have a buffer pinned. That can
be useful during development or to diagnose production issues
e.g. caused by vacuum waiting for cleanup locks.
To handle upgrades transparently - the extension might be used in
views - deal with callers expecting the old number of columns.
Reviewed by Fujii Masao and Rajeev rastogi.
strncmp() is a specialized API unsuited for routine copying into
fixed-size buffers. On a system where the length of a single filename
can exceed MAXPGPATH, the pg_archivecleanup change prevents a simple
crash in the subsequent strlen(). Few filesystems support names that
long, and calling pg_archivecleanup with untrusted input is still not a
credible use case. Therefore, no back-patch.
David Rowley
Now benchmarking options such as -c cannot be used if initializing
option (-i) is specified. Also initializing options such as -F cannot
be used if initializing option is not specified.
Tatsuo Ishii and Fabien COELHO.
Previously, TOAST tables only required in the new cluster could cause
oid conflicts if they were auto-numbered and a later conflicting oid had
to be assigned.
Backpatch through 9.3
autovacuum_multixact_freeze_max_age was added as a pg_ctl start
parameter in 9.3.X to prevent autovacuum from running. However, only
some 9.3.X releases have autovacuum_multixact_freeze_max_age as it was
added in a minor PG 9.3 release. It also isn't needed because -b turns
off autovacuum in 9.1+.
Without this fix, trying to upgrade from an early 9.3 release to 9.4
would fail.
Report by EDB
Backpatch through 9.3
Commit 0ac5ad5134 removed an optimization in multixact.c that skipped
fetching members of MultiXactId that were older than our
OldestVisibleMXactId value. The reason this was removed is that it is
possible for multixacts that contain updates to be older than that
value. However, if the caller is certain that the multi does not
contain an update (because the infomask bits say so), it can pass this
info down to GetMultiXactIdMembers, enabling it to use the old
optimization.
Pointed out by Andres Freund in 20131121200517.GM7240@alap2.anarazel.de
get_raw_page tried to validate the supplied block number against
RelationGetNumberOfBlocks(), which of course is only right when
accessing the main fork. In most cases, the main fork is longer
than the others, so that the check was too weak (allowing a
lower-level error to be reported, but no real harm to be done).
However, very small tables could have an FSM larger than their heap,
in which case the mistake prevented access to some FSM pages.
Per report from Torsten Foertsch.
In passing, make the bad-block-number error into an ereport not elog
(since it's certainly not an internal error); and fix sloppily
maintained comment for RelationGetNumberOfBlocksInFork.
This has been wrong since we invented relation forks, so back-patch
to all supported branches.
With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
backends can crash at exit. Raise a warning during "configure" based on
the compile-time OpenLDAP version number, and test the crash scenario in
the dblink test suite. Back-patch to 9.0 (all supported versions).
ws2_32 is the new version of the library that should be used, as
it contains the require functionality from wsock32 as well as some
more (which is why some binaries were already using ws2_32).
Michael Paquier, reviewed by MauMau
Prominent binaries already had this metadata. A handful of minor
binaries, such as pg_regress.exe, still lack it; efforts to eliminate
such exceptions are welcome.
Michael Paquier, reviewed by MauMau.
Commit 1b86c81d2d fixed the decoding of toasted columns for the rows
contained in one xl_heap_multi_insert record. But that's not actually
enough, because heap_multi_insert() will actually first toast all
passed in rows and then emit several *_multi_insert records; one for
each page it fills with tuples.
Add a XLOG_HEAP_LAST_MULTI_INSERT flag which is set in
xl_heap_multi_insert->flag denoting that this multi_insert record is
the last emitted by one heap_multi_insert() call. Then use that flag
in decode.c to only set clear_toast_afterwards in the right situation.
Expand the number of rows inserted via COPY in the corresponding
regression test to make sure that more than one heap page is filled
with tuples by one heap_multi_insert() call.
Backpatch to 9.4 like the previous commit.
This command provides an automated way to create foreign table definitions
that match remote tables, thereby reducing tedium and chances for error.
In this patch, we provide the necessary core-server infrastructure and
implement the feature fully in the postgres_fdw foreign-data wrapper.
Other wrappers will throw a "feature not supported" error until/unless
they are updated.
Ronan Dunklau and Michael Paquier, additional work by me
Previously, when calculations on the need for toast tables changed,
pg_upgrade could not handle cases where the new cluster needed a TOAST
table and the old cluster did not. (It already handled the opposite
case.) This fixes the "OID mismatch" error typically generated in this
case.
Backpatch through 9.2
When decoding the results of a HEAP2_MULTI_INSERT (currently only
generated by COPY FROM) toast columns for all but the last tuple
weren't replaced by their actual contents before being handed to the
output plugin. The reassembled toast datums where disregarded after
every REORDER_BUFFER_CHANGE_(INSERT|UPDATE|DELETE) which is correct
for plain inserts, updates, deletes, but not multi inserts - there we
generate several REORDER_BUFFER_CHANGE_INSERTs for a single
xl_heap_multi_insert record.
To solve the problem add a clear_toast_afterwards boolean to
ReorderBufferChange's union member that's used by modifications. All
row changes but multi_inserts always set that to true, but
multi_insert sets it only for the last change generated.
Add a regression test covering decoding of multi_inserts - there was
none at all before.
Backpatch to 9.4 where logical decoding was introduced.
Bug found by Petr Jelinek.
The isxdigit() calls relied on undefined behavior. The isascii() call
was well-defined, but our prevailing style is to include the cast.
Back-patch to 9.4, where the isxdigit() calls were introduced.
Historically these database properties could be manipulated only by
manually updating pg_database, which is error-prone and only possible for
superusers. But there seems no good reason not to allow database owners to
set them for their databases, so invent CREATE/ALTER DATABASE options to do
that. Adjust a couple of places that were doing it the hard way to use the
commands instead.
Vik Fearing, reviewed by Pavel Stehule
The output buffer size in unaccent_lexize() was calculated as input string
length times pg_database_encoding_max_length(), which effectively assumes
that replacement strings aren't more than one character. While that was
all that we previously documented it to support, the code actually has
always allowed replacement strings of arbitrary length; so if you tried
to make use of longer strings, you were at risk of buffer overrun. To fix,
use an expansible StringInfo buffer instead of trying to determine the
maximum space needed a-priori.
This would be a security issue if unaccent rules files could be installed
by unprivileged users; but fortunately they can't, so in the back branches
the problem can be labeled as improper configuration by a superuser.
Nonetheless, a memory stomp isn't a nice way of reacting to improper
configuration, so let's back-patch the fix.
We were already issuing a WARNING, albeit only elog not ereport, for
duplicate source strings; so warning rather than just being stoically
silent seems like the best thing to do here. Arguably both of these
complaints should be upgraded to ERRORs, but that might be more
behavioral change than people want.
Note: the faulty line is already printed via an errcontext hook,
so there's no need for more information than these messages provide.
This could be useful in languages where diacritic signs are represented as
separate characters; more generally it supports using unaccent dictionaries
for substring substitutions beyond narrowly conceived "diacritic removal".
In any case, since the rule-file parser doesn't complain about
multi-character source strings, it behooves us to do something unsurprising
with them.
This is useful in languages where diacritic signs are represented as
separate characters; it's also one step towards letting unaccent be used
for arbitrary substring substitutions.
In passing, improve the user documentation for unaccent, which was sadly
vague about some important details.
Mohammad Alhashash, reviewed by Abhijit Menon-Sen
There were some C comments that hadn't been updated from the switch of
using only pg_dumpall to using pg_dump and pg_dumpall, so update them.
Also, don't bother using --schema-only for pg_dumpall --globals-only.
Backpatch through 9.4
This function continued to use it after heap_endscan() freed it. In
passing, don't explicit create a strategy here. Instead, use the one
created by heap_beginscan_strat(), if any. Back-patch to 9.2, where use
of a BufferAccessStrategy here was introduced.
This fixes a bug that caused vacuum to fail when the '0000' files left
by initdb were accessed as part of vacuum's cleanup of old pg_multixact
files.
Backpatch through 9.3
dblink uses a short-lived data conversion memory context. However it
was not deleted when no longer needed, leading to a noticeable memory
leak under some circumstances. Plug the hole, along with minor
refactoring. Backpatch to 9.2 where the leak was introduced.
Report and initial patch by MauMau. Reviewed/modified slightly by
Tom Lane and me.
Give passwords to each user created in support of an ECPG connection
test case. Use SET SESSION AUTHORIZATION, not a fresh connection, to
reduce privileges during a dblink test case.
To test against such a server, both the "make installcheck-world"
environment and the postmaster environment must provide the default
user's password; $PGPASSFILE is the principal way to do so. (The
postmaster environment needs it for dblink and postgres_fdw tests.)
Arrange for postmaster child processes to respond to two environment
variables, PG_OOM_ADJUST_FILE and PG_OOM_ADJUST_VALUE, to determine whether
they reset their OOM score adjustments and if so to what. This is superior
to the previous design involving #ifdef's in several ways. The behavior is
now available in a default build, and both ends of the adjustment --- the
original adjustment of the postmaster's level and the subsequent
readjustment by child processes --- can now be controlled in one place,
namely the postmaster launch script. So it's no longer necessary for the
launch script to act on faith that the server was compiled with the
appropriate options. In addition, if someone wants to use an OOM score
other than zero for the child processes, that doesn't take a recompile
anymore; and we no longer have to cater separately to the two different
historical kernel APIs for this adjustment.
Gurjeet Singh, somewhat revised by me
This SQL-standard feature allows a sub-SELECT yielding multiple columns
(but only one row) to be used to compute the new values of several columns
to be updated. While the same results can be had with an independent
sub-SELECT per column, such a workaround can require a great deal of
duplicated computation.
The standard actually says that the source for a multi-column assignment
could be any row-valued expression. The implementation used here is
tightly tied to our existing sub-SELECT support and can't handle other
cases; the Bison grammar would have some issues with them too. However,
I don't feel too bad about this since other cases can be converted into
sub-SELECTs. For instance, "SET (a,b,c) = row_valued_function(x)" could
be written "SET (a,b,c) = (SELECT * FROM row_valued_function(x))".
Since most of the system thinks AND and OR are N-argument expressions
anyway, let's have the grammar generate a representation of that form when
dealing with input like "x AND y AND z AND ...", rather than generating
a deeply-nested binary tree that just has to be flattened later by the
planner. This avoids stack overflow in parse analysis when dealing with
queries having more than a few thousand such clauses; and in any case it
removes some rather unsightly inconsistencies, since some parts of parse
analysis were generating N-argument ANDs/ORs already.
It's still possible to get a stack overflow with weirdly parenthesized
input, such as "x AND (y AND (z AND ( ... )))", but such cases are not
mainstream usage. The maximum depth of parenthesization is already
limited by Bison's stack in such cases, anyway, so that the limit is
probably fairly platform-independent.
Patch originally by Gurjeet Singh, heavily revised by me
Any OS user able to access the socket can connect as the bootstrap
superuser and proceed to execute arbitrary code as the OS user running
the test. Protect against that by placing the socket in a temporary,
mode-0700 subdirectory of /tmp. The pg_regress-based test suites and
the pg_upgrade test suite were vulnerable; the $(prove_check)-based test
suites were already secure. Back-patch to 8.4 (all supported versions).
The hazard remains wherever the temporary cluster accepts TCP
connections, notably on Windows.
As a convenient side effect, this lets testing proceed smoothly in
builds that override DEFAULT_PGSOCKET_DIR. Popular non-default values
like /var/run/postgresql are often unwritable to the build user.
Security: CVE-2014-0067
187492b6c2 changed pgstat.c so that
the stats files were saved into $PGDATA/pg_stat directory when the server
was shutdowned. But it accidentally forgot to change the location of
pg_stat_statements permanent stats file. This commit fixes pg_stat_statements
so that its stats file is also saved into $PGDATA/pg_stat at shutdown.
Since this fix changes the file layout, we don't back-patch it to 9.3
where this oversight was introduced.
The original coding in contrib/uuid-ossp created and destroyed a uuid_t
object (or, in some cases, even two of them) each time it was called.
This is not the intended usage: you're supposed to keep the uuid_t object
around so that the library can cache its state across uses. (Other UUID
libraries seem to keep equivalent state behind-the-scenes in static
variables, but OSSP chose differently.) Aside from being quite inefficient,
creating a new uuid_t loses knowledge of the previously generated UUID,
which in theory could result in duplicate V1-style UUIDs being created
on sufficiently fast machines.
On at least some platforms, creating a new uuid_t also draws some entropy
from /dev/urandom, leaving less for the rest of the system. This seems
sufficiently unpleasant to justify back-patching this change.
The previous version of these tests expected uuid_generate_v1() to always
emit MAC addresses with the local-admin and multicast address bits zero.
However, several of the buildfarm critters are reporting values with the
local-admin bit set. (Perhaps they're running inside VMs or jails.)
And a couple are reporting values with the multicast bit set, probably
meaning that the UUID library couldn't read the system MAC address.
Also, it emerges that if OSSP UUID can't read the system MAC address, it
falls back to V1MC behavior wherein the whole node field gets randomized
each time, breaking the test that expected the node field to remain stable
in V1 output. (It looks like e2fs doesn't behave that way, though.)
It's not entirely clear why we can't get a system MAC address, since the
buildfarm scripts would not work without internet access. Nonetheless,
the regression tests had better cope with the case, so adjust the tests
to expect these behaviors.
This reverts commit 45b7abe59e.
It turns out that the %name-prefix syntax without "=" does not work
at all in pre-2.4 Bison. We are not prepared to make such a large
jump in minimum required Bison version just to suppress a warning
message in a version hardly any developers are using yet.
When 3.0 gets more popular, we'll figure out a way to deal with this.
In the meantime, BISONFLAGS=-Wno-deprecated is recommendable for
anyone using 3.0 who doesn't want to see the warning.
%name-prefix doesn't use an "=" sign according to the Bison docs, but it
silently accepted one anyway, until Bison 3.0. This was originally a
typo of mine in commit 012abebab1, and we
seem to have slavishly copied the error into all the other grammar files.
Per report from Vik Fearing; analysis by Peter Eisentraut.
Back-patch to all active branches, since somebody might try to build
a back branch with up-to-date tools.
On reflection, the timestamp-advances test might fail if we're unlucky
enough for the time_mid field to change between two calls, since uuid_cmp
is just bytewise comparison and the field ordering has more significant
fields later. Build some field extraction functions so we can do a more
honest test of that. Also check that the version and reserved fields
contain what they should.
The V5 (SHA1 hashing) code wrote 20 bytes into a 16-byte local variable.
This had accidentally failed to fail in my testing and Matteo's, but
buildfarm results exposed the problem.
Allow the contrib/uuid-ossp extension to be built atop any one of these
three popular UUID libraries. (The extension's name is now arguably a
misnomer, but we'll keep it the same so as not to cause unnecessary
compatibility issues for users.)
We would not normally consider a change like this post-beta1, but the issue
has been forced by our upgrade to autoconf 2.69, whose more rigorous header
checks are causing OSSP's header files to be rejected on some platforms.
It's been foreseen for some time that we'd have to move away from depending
on OSSP UUID due to lack of upstream maintenance, so this is a down payment
on that problem.
While at it, add some simple regression tests, in hopes of catching any
major incompatibilities between the three implementations.
Matteo Beccati, with some further hacking by me
Commit 090d0f2050 added new code showing
how it can be useful to set bgw_notify_pid to a non-zero value, but it
failed to make sure that the existing call to RegisterBackgroundWorker
initialized the new field at all.
Report and patch by Shigeru Hanada.
On Mingw, it seems that scanf() doesn't necessarily accept the same format
codes that printf() does, and in particular it may fail to recognize %llu
even though printf() does. Since configure only probes printf() behavior
while setting up the INT64_FORMAT macros, this means it's unsafe to use
those macros with scanf(). We had only one instance of such a coding
pattern, in contrib/pg_stat_statements, so change that code to avoid
the problem.
Per buildfarm warnings. Back-patch to 9.0 where the troublesome code
was introduced.
Michael Paquier
Change the total-transactions counters from int32 to int64 to accommodate
cases where we do more than 2^31 transactions during a run. This patch
does not change the INT_MAX limit on explicit "-t" parameters, but it
does allow the product of the -t and -c parameters to exceed INT_MAX, or
allow a -T limit that is large enough that more than 2^31 transactions
can be completed. While pgbench did not actually fail in such cases,
it did print an incorrect total-transactions count, and some of the
derived numbers such as TPS would have been wrong as well.
Tomas Vondra
C89 says that compound initializers may only contain constant expressions;
a restriction violated by commit 89d00cbe. While we've had no actual field
complaints about this, C89 is still the project standard, and it's not
saving all that much code to break compatibility here. So let's adhere to
the old restriction.
In passing, replace a bunch of hardwired constants "256" with
sizeof(target-variable), just because the latter is more readable and
less breakable. And const-ify where possible.
Back-patch to 9.3 where the nonportable code was added.
Andres Freund and Tom Lane
gbt_macad_union also allocated 12-byte structs where we really need 16.
Per report from Andres Freund. No back-patch since there's no current
risk of a real problem.
The macaddr opclass stores two macaddr structs (each of size 6) in an
index column that's declared as being of type gbtreekey16, ie 16 bytes.
In the original coding this led to passing a palloc'd value of size 12
to the index insertion code, so that data would be fetched past the
end of the allocated value during index tuple construction. This makes
valgrind unhappy. In principle it could result in a SIGSEGV, though
with the current implementation of palloc there's no risk since
the 12-byte request size would be rounded up to 16 bytes anyway.
To fix, add a field to struct gbtree_ninfo showing the declared size of
the index datums, and use that in the palloc requests; and use palloc0
to be sure that any wasted bytes are cleanly initialized.
Per report from Andres Freund. No back-patch since there's no current
risk of a real problem.
pg_stat_replication shows connected replication clients. The ddl test case
never has any replication clients connected, so querying pg_stat_replication
is pointless. To check that a slot has been dropped correctly, query
pg_replication_slots instead.
Andres Freund
The code expands a varbit gist leaf key to a node key by copying the bit
data twice in a varlen datum, as both the lower and upper key. The lower key
was expanded to INTALIGN size, but the padding bytes were not initialized.
That's a problem because when the lower/upper keys are compared, the padding
bytes are used compared too, when the values are otherwise equal. That could
lead to incorrect query results.
REINDEX is advised for any btree_gist indexes on bit or bit varying data
type, to fix any garbage padding bytes on disk.
Per Valgrind, reported by Andres Freund. Backpatch to all supported
versions.
It's easy to forget using SYSTEMQUOTEs when constructing command strings
for system() or popen(). Even if we fix all the places missing it now, it is
bound to be forgotten again in the future. Introduce wrapper functions that
do the the extra quoting for you, and get rid of SYSTEMQUOTEs in all the
callers.
We previosly used SYSTEMQUOTEs in all the hard-coded command strings, and
this doesn't change the behavior of those. But user-supplied commands, like
archive_command, restore_command, COPY TO/FROM PROGRAM calls, as well as
pgbench's \shell, will now gain an extra pair of quotes. That is desirable,
but if you have existing scripts or config files that include an extra
pair of quotes, those might need to be adjusted.
Reviewed by Amit Kapila and Tom Lane
Commit a730183926 created rather a mess by
putting dependencies on backend-only include files into include/common.
We really shouldn't do that. To clean it up:
* Move TABLESPACE_VERSION_DIRECTORY back to its longtime home in
catalog/catalog.h. We won't consider this symbol part of the FE/BE API.
* Push enum ForkNumber from relfilenode.h into relpath.h. We'll consider
relpath.h as the source of truth for fork numbers, since relpath.c was
already partially serving that function, and anyway relfilenode.h was
kind of a random place for that enum.
* So, relfilenode.h now includes relpath.h rather than vice-versa. This
direction of dependency is fine. (That allows most, but not quite all,
of the existing explicit #includes of relpath.h to go away again.)
* Push forkname_to_number from catalog.c to relpath.c, just to centralize
fork number stuff a bit better.
* Push GetDatabasePath from catalog.c to relpath.c; it was rather odd
that the previous commit didn't keep this together with relpath().
* To avoid needing relfilenode.h in common/, redefine the underlying
function (now called GetRelationPath) as taking separate OID arguments,
and make the APIs using RelFileNode or RelFileNodeBackend into macro
wrappers. (The macros have a potential multiple-eval risk, but none of
the existing call sites have an issue with that; one of them had such a
risk already anyway.)
* Fix failure to follow the directions when "init" fork type was added;
specifically, the errhint in forkname_to_number wasn't updated, and neither
was the SGML documentation for pg_relation_size().
* Fix tablespace-path-too-long check in CreateTableSpace() to account for
fork-name component of maximum-length pathnames. This requires putting
FORKNAMECHARS into a header file, but it was rather useless (and
actually unreferenced) where it was.
The last couple of items are potentially back-patchable bug fixes,
if anyone is sufficiently excited about them; but personally I'm not.
Per a gripe from Christoph Berg about how include/common wasn't
self-contained.
Some popen() calls were missing SYSTEMQUOTEs, which caused initdb and
pg_upgrade to fail on Windows, if the installation path contained both
spaces and @ signs.
Patch by Nikhil Deshpande. Backpatch to all supported versions.
pgss_post_parse_analyze() neglected to pass the call on to any earlier
occupant of the post_parse_analyze_hook. There are no other users of that
hook in contrib/, and most likely none in the wild either, so this is
probably just a latent bug. But it's a bug nonetheless, so back-patch
to 9.2 where this code was introduced.
Because of gcc -Wmissing-prototypes, all functions in dynamically
loadable modules must have a separate prototype declaration. This is
meant to detect global functions that are not declared in header files,
but in cases where the function is called via dfmgr, this is redundant.
Besides filling up space with boilerplate, this is a frequent source of
compiler warnings in extension modules.
We can fix that by creating the function prototype as part of the
PG_FUNCTION_INFO_V1 macro, which such modules have to use anyway. That
makes the code of modules cleaner, because there is one less place where
the entry points have to be listed, and creates an additional check that
functions have the right prototype.
Remove now redundant prototypes from contrib and other modules.
Specifically, on-stack memset() might be removed, so:
* Replace memset() with px_memset()
* Add px_memset to copy_crlf()
* Add px_memset to pgp-s2k.c
Patch by Marko Kreen
Report by PVS-Studio
Backpatch through 8.4.
Non-existent tablespace directory references can occur if user
tablespaces are created inside data directories and the data directory
is renamed in preparation for running pg_upgrade, and the symbolic links
are not updated.
Backpatch to 9.3.
We were emitting "(SELECT null::typename)", which is usually interpreted
as a scalar subselect, but not so much in the context "x = ANY(...)".
This led to remote-side parsing failures when remote_estimate is enabled.
A quick and ugly fix is to stick in an extra cast step,
"((SELECT null::typename)::typename)". The cast will be thrown away as
redundant by parse analysis, but not before it's done its job of making
sure the grammar sees the ANY argument as an a_expr rather than a
select_with_parens. Per an example from Hannu Krosing.
Add vacuumdb option --analyze-in-stages which runs ANALYZE three times
with different configuration settings, adopting the logic from the
analyze_new_cluster.sh script that pg_upgrade generates. That way,
users of pg_dump/pg_restore can also use that functionality.
Change pg_upgrade to create the script so that it calls vacuumdb instead
of implementing the logic itself.
When extracting trigrams from a regular expression for search of a GIN or
GIST trigram index, it's useful to penalize (preferentially discard)
trigrams that contain whitespace, since those are typically far more common
in the index than trigrams not containing whitespace. Of course, this
should only be a preference not a hard rule, since we might otherwise end
up with no trigrams to search for. The previous coding tended to produce
fairly inefficient trigram search sets for anchored regexp patterns, as
reported by Erik Rijkers. This patch penalizes whitespace-containing
trigrams, and also reduces the target number of extracted trigrams, since
experience suggests that the original coding tended to select too many
trigrams to search for.
Alexander Korotkov, reviewed by Tom Lane
For variadic functions (other than VARIADIC ANY), the syntaxes foo(x,y,...)
and foo(VARIADIC ARRAY[x,y,...]) should be considered equivalent, since the
former is converted to the latter at parse time. They have indeed been
equivalent, in all releases before 9.3. However, commit 75b39e790 made an
ill-considered decision to record which syntax had been used in FuncExpr
nodes, and then to make equal() test that in checking node equality ---
which caused the syntaxes to not be seen as equivalent by the planner.
This is the underlying cause of bug #9817 from Dmitry Ryabov.
It might seem that a quick fix would be to make equal() disregard
FuncExpr.funcvariadic, but the same commit made that untenable, because
the field actually *is* semantically significant for some VARIADIC ANY
functions. This patch instead adopts the approach of redefining
funcvariadic (and aggvariadic, in HEAD) as meaning that the last argument
is a variadic array, whether it got that way by parser intervention or was
supplied explicitly by the user. Therefore the value will always be true
for non-ANY variadic functions, restoring the principle of equivalence.
(However, the planner will continue to consider use of VARIADIC as a
meaningful difference for VARIADIC ANY functions, even though some such
functions might disregard it.)
In HEAD, this change lets us simplify the decompilation logic in
ruleutils.c, since the funcvariadic/aggvariadic flag tells directly whether
to print VARIADIC. However, in 9.3 we have to continue to cope with
existing stored rules/views that might contain the previous definition.
Fortunately, this just means no change in ruleutils.c, since its existing
behavior effectively ignores funcvariadic for all cases other than VARIADIC
ANY functions.
In HEAD, bump catversion to reflect the fact that FuncExpr.funcvariadic
changed meanings; this is sort of pro forma, since I don't believe any
built-in views are affected.
Unfortunately, this patch doesn't magically fix everything for affected
9.3 users. After installing 9.3.5, they might need to recreate their
rules/views/indexes containing variadic function calls in order to get
everything consistent with the new definition. As in the cited bug,
the symptom of a problem would be failure to use a nominally matching
index that has a variadic function call in its definition. We'll need
to mention this in the 9.3.5 release notes.
contrib/test_decoding's "make check" runs two sets of tests. Unless we
specify separate output directories for each set the isolation tests
will overwrite the output from the normal regression set. Doing this
will help the buildfarm collect complete logs.
Any OS user able to access the socket can connect as the bootstrap
superuser and in turn execute arbitrary code as the OS user running the
test. Protect against that by placing the socket in the temporary data
directory, which has mode 0700 thanks to initdb. Back-patch to 8.4 (all
supported versions). The hazard remains wherever the temporary cluster
accepts TCP connections, notably on Windows.
Attempts to run "make check" from a directory with a long name will now
fail. An alternative not sharing that problem was to place the socket
in a subdirectory of /tmp, but that is only secure if /tmp is sticky.
The PG_REGRESS_SOCK_DIR environment variable is available as a
workaround when testing from long directory paths.
As a convenient side effect, this lets testing proceed smoothly in
builds that override DEFAULT_PGSOCKET_DIR. Popular non-default values
like /var/run/postgresql are often unwritable to the build user.
Security: CVE-2014-0067
The new format accepts exactly the same data as the json type. However, it is
stored in a format that does not require reparsing the orgiginal text in order
to process it, making it much more suitable for indexing and other operations.
Insignificant whitespace is discarded, and the order of object keys is not
preserved. Neither are duplicate object keys kept - the later value for a given
key is the only one stored.
The new type has all the functions and operators that the json type has,
with the exception of the json generation functions (to_json, json_agg etc.)
and with identical semantics. In addition, there are operator classes for
hash and btree indexing, and two classes for GIN indexing, that have no
equivalent in the json type.
This feature grew out of previous work by Oleg Bartunov and Teodor Sigaev, which
was intended to provide similar facilities to a nested hstore type, but which
in the end proved to have some significant compatibility issues.
Authors: Oleg Bartunov, Teodor Sigaev, Peter Geoghegan and Andrew Dunstan.
Review: Andres Freund
This covers all the SQL-standard trigger types supported for regular
tables; it does not cover constraint triggers. The approach for
acquiring the old row mirrors that for view INSTEAD OF triggers. For
AFTER ROW triggers, we spool the foreign tuples to a tuplestore.
This changes the FDW API contract; when deciding which columns to
populate in the slot returned from data modification callbacks, writable
FDWs will need to check for AFTER ROW triggers in addition to checking
for a RETURNING clause.
In support of the feature addition, refactor the TriggerFlags bits and
the assembly of old tuples in ModifyTable.
Ronan Dunklau, reviewed by KaiGai Kohei; some additional hacking by me.
Clear errno before calling readdir() and handle old MinGW errno bug
while adding full test coverage for readdir/closedir failures.
Backpatch through 8.4.
I discovered the hard way that on some old shells, the locution
FOO="" unset FOO
does not behave the same as
FOO=""; unset FOO
and in fact leaves FOO set to an empty string. test.sh was inconsistently
spelling it different ways on adjacent lines.
This got broken relatively recently, in commit c737a2e56, so the lack of
field reports to date doesn't represent a lot of evidence that the problem
is rare.
In b89e151054 I had assumed it was ok to use anonymous unions as
struct members, but while a longstanding extension in many compilers,
it's only been standardized in C11.
To fix, remove one of the anonymous unions which tried to hide some
implementation specific enum values and give the other a name. The
latter unfortunately requires changes in output plugins, but since the
feature has only been added a few days ago...
Andres Freund
The previous coding supposed that it could consider just a single join
condition in any one parameterized path for the foreign table. But in
reality, the parameterized-path machinery forces all join clauses that are
"movable to" the foreign table to be evaluated at that node; including
clauses that we might not consider safe to send across. Such cases would
result in an Assert failure in an assert-enabled build, and otherwise in
sending an unsafe clause to the foreign server, which might result in
errors or silently-wrong answers. A lesser problem was that the
cost/rowcount estimates generated for the parameterized path failed to
account for any additional join quals that get assigned to the scan.
To fix, rewrite postgresGetForeignPaths so that it correctly collects all
the movable quals for any one outer relation when generating parameterized
paths; we'll now generate just one path per outer relation not one per join
qual. Also fix bogus assumptions in postgresGetForeignPlan and
estimate_path_cost_size that only safe-to-send join quals will be
presented.
Based on complaint from Etsuro Fujita that the path costs were being
miscalculated, though this is significantly different from his proposed
patch.
Commit 6f37c08057 removed whitespace
from the SQL file but not the expected-output file, and commit
7e8db2dc42 changed the error message
without updating the expected outputs.
This forces an input field containing the quoted null string to be
returned as a NULL. Without this option, only unquoted null strings
behave this way. This helps where some CSV producers insist on quoting
every field, whether or not it is needed. The option takes a list of
fields, and only applies to those columns. There is an equivalent
column-level option added to file_fdw.
Ian Barwick, with some tweaking by Andrew Dunstan, reviewed by Payal
Singh.
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the effected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit
Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve
Singer.
Testing convert_to(..., 'ISO-8859-1') fails if there isn't a conversion
function available from the database encoding to ISO-8859-1. This has
been broken since day one, but the breakage was hidden by
pg_do_encoding_conversion's failure to complain, up till commit
49c817eab7.
Since the data being converted in this test is plain ASCII, no actual
conversion need happen (and if it did, it would prove little about citext
anyway). So that we still have some code coverage of the convert() family
of functions, let's switch to using convert_from, with SQL_ASCII as the
specified source encoding. Per buildfarm.
A large majority of the callers of pg_do_encoding_conversion were
specifying the database encoding as either source or target of the
conversion, meaning that we can use the less general functions
pg_any_to_server/pg_server_to_any instead.
The main advantage of using the latter functions is that they can make use
of a cached conversion-function lookup in the common case that the other
encoding is the current client_encoding. It's notationally cleaner too in
most cases, not least because of the historical artifact that the latter
functions use "char *" rather than "unsigned char *" in their APIs.
Note that pg_any_to_server will apply an encoding verification step in
some cases where pg_do_encoding_conversion would have just done nothing.
This seems to me to be a good idea at most of these call sites, though
it partially negates the performance benefit.
Per discussion of bug #9210.
The length of the output buffer was calculated based on the size of the
argument hstore. On a sizeof(int) == 4 platform and a huge argument, it
could overflow, causing a too small buffer to be allocated.
Refactor the function to use a StringInfo instead of pre-allocating the
buffer. Makes it shorter and more readable, too.
Coverity identified a number of places in which it couldn't prove that a
string being copied into a fixed-size buffer would fit. We believe that
most, perhaps all of these are in fact safe, or are copying data that is
coming from a trusted source so that any overrun is not really a security
issue. Nonetheless it seems prudent to forestall any risk by using
strlcpy() and similar functions.
Fixes by Peter Eisentraut and Jozef Mlich based on Coverity reports.
In addition, fix a potential null-pointer-dereference crash in
contrib/chkpass. The crypt(3) function is defined to return NULL on
failure, but chkpass.c didn't check for that before using the result.
The main practical case in which this could be an issue is if libc is
configured to refuse to execute unapproved hashing algorithms (e.g.,
"FIPS mode"). This ideally should've been a separate commit, but
since it touches code adjacent to one of the buffer overrun changes,
I included it in this commit to avoid last-minute merge issues.
This issue was reported by Honza Horak.
Security: CVE-2014-0065 for buffer overruns, CVE-2014-0066 for crypt()
Several functions, mostly type input functions, calculated an allocation
size such that the calculation wrapped to a small positive value when
arguments implied a sufficiently-large requirement. Writes past the end
of the inadvertent small allocation followed shortly thereafter.
Coverity identified the path_in() vulnerability; code inspection led to
the rest. In passing, add check_stack_depth() to prevent stack overflow
in related functions.
Back-patch to 8.4 (all supported versions). The non-comment hstore
changes touch code that did not exist in 8.4, so that part stops at 9.0.
Noah Misch and Heikki Linnakangas, reviewed by Tom Lane.
Security: CVE-2014-0064
We used to have externs for getopt() and its API variables scattered
all over the place. Now that we find we're going to need to tweak the
variable declarations for Cygwin, it seems like a good idea to have
just one place to tweak.
In this commit, the variables are declared "#ifndef HAVE_GETOPT_H".
That may or may not work everywhere, but we'll soon find out.
Andres Freund
postgres_fdw tended to say "unknown error" if it tried to execute a command
on an already-dead connection, because some paths in libpq just return a
null PGresult for such cases. Out-of-memory might result in that, too.
To fix, pass the PGconn to pgfdw_report_error, and look at its
PQerrorMessage() string if we can't get anything out of the PGresult.
Also, fix the transaction-exit logic to reliably drop a dead connection.
It was attempting to do that already, but it assumed that only connection
cache entries with xact_depth > 0 needed to be examined. The folly in that
is that if we fail while issuing START TRANSACTION, we'll not have bumped
xact_depth. (At least for the case I was testing, this fix masks the
other problem; but it still seems like a good idea to have the PGconn
fallback logic.)
Per investigation of bug #9087 from Craig Lucas. Backpatch to 9.3 where
this code was introduced.
The temporary statistics files don't need to be included in the backup
because they are always reset at the beginning of the archive recovery.
This patch changes pg_basebackup so that it skips all files located in
$PGDATA/pg_stat_tmp or the directory specified by stats_temp_directory
parameter.
WalSndKill was doing things exactly backwards: it should first clear
MyWalSnd (to stop signal handlers from touching MyWalSnd->latch),
then disown the latch, and only then mark the WalSnd struct unused by
clearing its pid field.
Also, WalRcvSigUsr1Handler and worker_spi_sighup failed to preserve
errno, which is surely a requirement for any signal handler.
Per discussion of recent buildfarm failures. Back-patch as far
as the relevant code exists.
The buildfarm says commit 58274728fb doesn't
work so well on Windows. This is because the encoding part of Windows
locale names can be just a code page number, eg "1252", which we don't
consider to be a valid encoding name. Add a check to accept encoding
parts that are case-insensitively string equal; this at least ensures
that the new code doesn't reject any cases that the old code allowed.
Even though the server tries to canonicalize stored locale names, the
platform often doesn't cooperate, so it's entirely possible that one DB
thinks its locale is, say, "en_US.UTF-8" while the other has "en_US.utf8".
Rather than failing, we should try to allow this where it's clearly OK.
There is already pretty robust encoding lookup in encnames.c, so make
use of that to compare the encoding parts of the names. The locale
identifier parts are just compared case-insensitively, which we were
already doing. The major problem known to exist in the field is variant
encoding-name spellings, so hopefully this will be Good Enough. If not,
we can try being even laxer.
Pavel Raiskup, reviewed by Rushabh Lathia
Thinko in error report (and a typo in the message text, too). We're
failing anyway, but it would be good to print something useful first.
Noted while reviewing a patch to make pg_upgrade's locale code laxer.
This change allows us to eliminate the previous limit on stored query
length, and it makes the shared-memory hash table very much smaller,
allowing more statements to be tracked. (The default value of
pg_stat_statements.max is therefore increased from 1000 to 5000.)
In typical scenarios, the hash table can be large enough to hold all the
statements commonly issued by an application, so that there is little
"churn" in the set of tracked statements, and thus little need to do I/O
to the file.
To further reduce the need for I/O to the query-texts file, add a way
to retrieve all the columns of the pg_stat_statements view except for
the query text column. This is probably not of much interest for human
use but it could be exploited by programs, which will prefer using the
queryid anyway.
Ordinarily, we'd need to bump the extension version number for the latter
change. But since we already advanced pg_stat_statements' version number
from 1.1 to 1.2 in the 9.4 development cycle, it seems all right to just
redefine what 1.2 means.
Peter Geoghegan, reviewed by Pavel Stehule