Add support for regexps on database and user entries in pg_hba.conf

As of this commit, any database or user entry beginning with a slash (/)
is considered a regular expression.  This is particularly useful for
users, as there is currently no clean way to match a pattern across
multiple HBA lines.  For example, a user name mapping with a regular
expression first needs to match an HBA line, and the follow-up HBA
entries are skipped if the ident regexp does *not* match what has
matched in the HBA line.

pg_hba.conf is able to handle multiple databases and roles with a
comma-separated list of these, hence individual regular expressions that
include commas need to be double-quoted.
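
As an illustration of why the quoting matters, here is a minimal sketch of
splitting one field on commas while honoring double quotes.  This is not
the tokenizer in hba.c; next_list_item() is a hypothetical helper written
for this example, and it silently truncates items that do not fit the
output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL code: copy the next list item from
 * *cursor into out, stopping at a comma that is not inside double quotes.
 * The quotes themselves are dropped.  Returns 0 once the input is
 * exhausted, 1 otherwise.
 */
static int
next_list_item(const char **cursor, char *out, size_t outlen)
{
    const char *p = *cursor;
    size_t      n = 0;
    int         in_quote = 0;

    assert(outlen > 0);
    if (*p == '\0')
        return 0;

    for (; *p != '\0'; p++)
    {
        if (*p == '"')
            in_quote = !in_quote;   /* toggle, do not copy the quote */
        else if (*p == ',' && !in_quote)
        {
            p++;                    /* consume the separator */
            break;
        }
        else if (n + 1 < outlen)
            out[n++] = *p;          /* commas inside quotes land here */
    }
    out[n] = '\0';
    *cursor = p;
    return 1;
}
```

Called repeatedly on db1,"/^db\d{2,4}$",db2, the sketch yields three
items, with the comma inside the quoted regular expression preserved.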

At authentication time, user and database names are now checked in the
following order:
- Arbitrary keywords (like "all", or the ones beginning with '+' for a
membership check), which we know will never have a regexp.  A special
case is physical WAL senders, for which we *have* to match only
"replication" for the database.
- Regular expression matching.
- Exact match.
The previous logic did the same, but without the regexp step.
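
The order above can be sketched for a single user entry as follows.  This
is an assumption-laden illustration, not the code in check_role(): it uses
POSIX <regex.h> rather than PostgreSQL's regex engine, entry_matches_role()
is a hypothetical name, the '+' membership check is left out because it
needs catalog access, and quoted keywords are not distinguished from
unquoted ones.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL code: decide whether one user entry
 * matches a role name, checking keyword, then regexp, then exact match.
 */
static bool
entry_matches_role(const char *entry, const char *role)
{
    assert(entry != NULL && role != NULL);

    /* 1. Keywords never carry a regular expression. */
    if (strcmp(entry, "all") == 0)
        return true;

    /* 2. A leading slash marks the rest of the entry as a regexp. */
    if (entry[0] == '/')
    {
        regex_t     re;
        bool        matched;

        if (regcomp(&re, entry + 1, REG_EXTENDED | REG_NOSUB) != 0)
            return false;       /* invalid pattern: treat as no match */
        matched = (regexec(&re, role, 0, NULL, 0) == 0);
        regfree(&re);
        return matched;
    }

    /* 3. Otherwise fall back to an exact comparison. */
    return strcmp(entry, role) == 0;
}
```

Unlike this sketch, the committed code compiles each regexp once when
pg_hba.conf is parsed rather than on every connection attempt.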

We have also discussed the possibility of supporting regexp pattern
matching for host names, but this leads to tricky issues based on what
I understand, particularly with host entries that have CIDRs.

This commit relies heavily on the refactoring done in a903971 and
fc579e1, so the amount of code required to compile and execute
regular expressions is now minimal.  When parsing pg_hba.conf, all the
computed regexps need to be explicitly free()'d, same as for
pg_ident.conf.
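
The compile-then-free lifecycle can be sketched as below.  This is only an
illustration under stated assumptions: SketchToken and its two functions
are hypothetical stand-ins for AuthToken, regcomp_auth_token() and
free_auth_token(), and POSIX <regex.h> with malloc() stands in for
PostgreSQL's regex engine.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an HBA token; not PostgreSQL's AuthToken. */
typedef struct SketchToken
{
    const char *string;     /* raw entry text, e.g. "/^db" */
    regex_t    *regex;      /* compiled pattern, or NULL for plain names */
} SketchToken;

/* Compile the pattern if the entry starts with a slash; false on error. */
static bool
sketch_token_compile(SketchToken *tok)
{
    assert(tok != NULL && tok->string != NULL);

    tok->regex = NULL;
    if (tok->string[0] != '/')
        return true;        /* plain name, nothing to compile */

    tok->regex = malloc(sizeof(regex_t));
    if (tok->regex == NULL)
        return false;
    if (regcomp(tok->regex, tok->string + 1, REG_EXTENDED | REG_NOSUB) != 0)
    {
        free(tok->regex);   /* regcomp failed, only the shell is ours */
        tok->regex = NULL;
        return false;
    }
    return true;
}

/* Release the compiled pattern; safe to call on plain-name tokens. */
static void
sketch_token_free(SketchToken *tok)
{
    if (tok->regex != NULL)
    {
        regfree(tok->regex);
        free(tok->regex);
        tok->regex = NULL;
    }
}
```

The point of the sketch is that the compiled state lives outside whatever
owns the token itself, so dropping the token's container is not enough and
a separate free step is needed on both the error and the reload paths.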

Documentation and TAP tests are added to cover this feature, including
cases where the regexps use commas (for clarity in the docs, coverage
for the parsing logic in the tests).

Note that this introduces a breakage with older versions, where a
database or user name beginning with a slash was treated as something to
check for an equal match.  Per discussion, we have discarded this as
being much of an issue in practice as it would require a cluster to
have database and/or role names that begin with a slash, as well as HBA
entries using these.  Hence, the consistency gained with regexps in
pg_ident.conf is more appealing in the long term.

**This compatibility change should be mentioned in the release notes.**

Author: Bertrand Drouvot
Reviewed-by: Jacob Champion, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/fff0d7c1-8ad4-76a1-9db3-0ab6ec338bf7@amazon.com
Michael Paquier 2022-10-24 11:45:31 +09:00
parent 5035c93c8a
commit 8fea86830e
3 changed files with 163 additions and 21 deletions


@ -233,11 +233,20 @@ hostnogssenc <replaceable>database</replaceable> <replaceable>user</replaceabl
doesn't match with logical replication connections. Note that physical
replication connections do not specify any particular database whereas
logical replication connections do specify it.
Otherwise, this is the name of
a specific <productname>PostgreSQL</productname> database.
Multiple database names can be supplied by separating them with
commas. A separate file containing database names can be specified by
preceding the file name with <literal>@</literal>.
Otherwise, this is the name of a specific
<productname>PostgreSQL</productname> database or a regular expression.
Multiple database names and/or regular expressions can be supplied by
separating them with commas.
</para>
<para>
If the database name starts with a slash (<literal>/</literal>), the
remainder of the name is treated as a regular expression.
(See <xref linkend="posix-syntax-details"/> for details of
<productname>PostgreSQL</productname>'s regular expression syntax.)
</para>
<para>
A separate file containing database names and/or regular expressions
can be specified by preceding the file name with <literal>@</literal>.
</para>
</listitem>
</varlistentry>
@ -249,7 +258,8 @@ hostnogssenc <replaceable>database</replaceable> <replaceable>user</replaceabl
Specifies which database user name(s) this record
matches. The value <literal>all</literal> specifies that it
matches all users. Otherwise, this is either the name of a specific
database user, or a group name preceded by <literal>+</literal>.
database user, a regular expression (when starting with a slash
(<literal>/</literal>)), or a group name preceded by <literal>+</literal>.
(Recall that there is no real distinction between users and groups
in <productname>PostgreSQL</productname>; a <literal>+</literal> mark really means
<quote>match any of the roles that are directly or indirectly members
@ -258,9 +268,18 @@ hostnogssenc <replaceable>database</replaceable> <replaceable>user</replaceabl
considered to be a member of a role if they are explicitly a member
of the role, directly or indirectly, and not just by virtue of
being a superuser.
Multiple user names can be supplied by separating them with commas.
A separate file containing user names can be specified by preceding the
file name with <literal>@</literal>.
Multiple user names and/or regular expressions can be supplied by
separating them with commas.
</para>
<para>
If the user name starts with a slash (<literal>/</literal>), the
remainder of the name is treated as a regular expression.
(See <xref linkend="posix-syntax-details"/> for details of
<productname>PostgreSQL</productname>'s regular expression syntax.)
</para>
<para>
A separate file containing user names and/or regular expressions can
be specified by preceding the file name with <literal>@</literal>.
</para>
</listitem>
</varlistentry>
@ -739,6 +758,14 @@ host all all ::1/128 trust
# TYPE DATABASE USER ADDRESS METHOD
host all all localhost trust
# The same using a regular expression for DATABASE, that allows connection
# to the database db1, db2 and any databases with a name beginning with "db"
# and finishing with a number using two to four digits (like "db1234" or
# "db12").
#
# TYPE DATABASE USER ADDRESS METHOD
local db1,"/^db\d{2,4}$",db2 all localhost trust
# Allow any user from any host with IP address 192.168.93.x to connect
# to database "postgres" as the same user name that ident reports for
# the connection (typically the operating system user name).
@ -785,15 +812,16 @@ host all all 192.168.12.10/32 gss
# TYPE DATABASE USER ADDRESS METHOD
host all all 192.168.0.0/16 ident map=omicron
# If these are the only three lines for local connections, they will
# If these are the only four lines for local connections, they will
# allow local users to connect only to their own databases (databases
# with the same name as their database user name) except for administrators
# and members of role "support", who can connect to all databases. The file
# $PGDATA/admins contains a list of names of administrators. Passwords
# are required in all cases.
# with the same name as their database user name) except for users whose
# names end with "helpdesk", administrators and members of role "support",
# who can connect to all databases. The file $PGDATA/admins contains a
# list of names of administrators. Passwords are required in all cases.
#
# TYPE DATABASE USER ADDRESS METHOD
local sameuser all md5
local all /^.*helpdesk$ md5
local all @admins md5
local all +support md5


@ -293,6 +293,30 @@ free_auth_token(AuthToken *token)
pg_regfree(token->regex);
}
/*
* Free a HbaLine. Its list of AuthTokens for databases and roles may include
* regular expressions that need to be cleaned up explicitly.
*/
static void
free_hba_line(HbaLine *line)
{
ListCell *cell;
foreach(cell, line->roles)
{
AuthToken *tok = lfirst(cell);
free_auth_token(tok);
}
foreach(cell, line->databases)
{
AuthToken *tok = lfirst(cell);
free_auth_token(tok);
}
}
/*
* Copy a AuthToken struct into freshly palloc'd memory.
*/
@ -661,6 +685,10 @@ is_member(Oid userid, const char *role)
/*
* Check AuthToken list for a match to role, allowing group names.
*
* Each AuthToken listed is checked one-by-one. Keywords are processed
* first (these cannot have regular expressions), followed by regular
* expressions (if any) and the exact match.
*/
static bool
check_role(const char *role, Oid roleid, List *tokens)
@ -676,8 +704,14 @@ check_role(const char *role, Oid roleid, List *tokens)
if (is_member(roleid, tok->string + 1))
return true;
}
else if (token_matches(tok, role) ||
token_is_keyword(tok, "all"))
else if (token_is_keyword(tok, "all"))
return true;
else if (token_has_regexp(tok))
{
if (regexec_auth_token(role, tok, 0, NULL) == REG_OKAY)
return true;
}
else if (token_matches(tok, role))
return true;
}
return false;
@ -685,6 +719,10 @@ check_role(const char *role, Oid roleid, List *tokens)
/*
* Check to see if db/role combination matches AuthToken list.
*
* Each AuthToken listed is checked one-by-one. Keywords are checked
* first (these cannot have regular expressions), followed by regular
* expressions (if any) and the exact match.
*/
static bool
check_db(const char *dbname, const char *role, Oid roleid, List *tokens)
@ -719,6 +757,11 @@ check_db(const char *dbname, const char *role, Oid roleid, List *tokens)
}
else if (token_is_keyword(tok, "replication"))
continue; /* never match this if not walsender */
else if (token_has_regexp(tok))
{
if (regexec_auth_token(dbname, tok, 0, NULL) == REG_OKAY)
return true;
}
else if (token_matches(tok, dbname))
return true;
}
@ -1138,8 +1181,13 @@ parse_hba_line(TokenizedAuthLine *tok_line, int elevel)
tokens = lfirst(field);
foreach(tokencell, tokens)
{
parsedline->databases = lappend(parsedline->databases,
copy_auth_token(lfirst(tokencell)));
AuthToken *tok = copy_auth_token(lfirst(tokencell));
/* Compile a regexp for the database token, if necessary */
if (regcomp_auth_token(tok, HbaFileName, line_num, err_msg, elevel))
return NULL;
parsedline->databases = lappend(parsedline->databases, tok);
}
/* Get the roles. */
@ -1158,8 +1206,13 @@ parse_hba_line(TokenizedAuthLine *tok_line, int elevel)
tokens = lfirst(field);
foreach(tokencell, tokens)
{
parsedline->roles = lappend(parsedline->roles,
copy_auth_token(lfirst(tokencell)));
AuthToken *tok = copy_auth_token(lfirst(tokencell));
/* Compile a regexp from the role token, if necessary */
if (regcomp_auth_token(tok, HbaFileName, line_num, err_msg, elevel))
return NULL;
parsedline->roles = lappend(parsedline->roles, tok);
}
if (parsedline->conntype != ctLocal)
@ -2355,12 +2408,31 @@ load_hba(void)
if (!ok)
{
/* File contained one or more errors, so bail out */
/*
* File contained one or more errors, so bail out, first being careful
* to clean up whatever we allocated. Most stuff will go away via
* MemoryContextDelete, but we have to clean up regexes explicitly.
*/
foreach(line, new_parsed_lines)
{
HbaLine *newline = (HbaLine *) lfirst(line);
free_hba_line(newline);
}
MemoryContextDelete(hbacxt);
return false;
}
/* Loaded new file successfully, replace the one we use */
if (parsed_hba_lines != NIL)
{
foreach(line, parsed_hba_lines)
{
HbaLine *newline = (HbaLine *) lfirst(line);
free_hba_line(newline);
}
}
if (parsed_hba_context != NULL)
MemoryContextDelete(parsed_hba_context);
parsed_hba_context = hbacxt;


@ -81,6 +81,14 @@ $node->safe_psql(
GRANT ALL ON sysuser_data TO md5_role;");
$ENV{"PGPASSWORD"} = 'pass';
# Create a role that contains a comma to stress the parsing.
$node->safe_psql('postgres',
q{SET password_encryption='md5'; CREATE ROLE "md5,role" LOGIN PASSWORD 'pass';}
);
# Create a database to test regular expression.
$node->safe_psql('postgres', "CREATE database regex_testdb;");
# For "trust" method, all users should be able to connect. These users are not
# considered to be authenticated.
reset_pg_hba($node, 'all', 'all', 'trust');
@ -200,6 +208,40 @@ append_to_file(
test_conn($node, 'user=md5_role', 'password from pgpass', 0);
# Testing with regular expression for username. The third regexp matches.
reset_pg_hba($node, 'all', '/^.*nomatch.*$, baduser, /^md.*$', 'password');
test_conn($node, 'user=md5_role', 'password, matching regexp for username',
0);
# The third regex does not match anymore.
reset_pg_hba($node, 'all', '/^.*nomatch.*$, baduser, /^m_d.*$', 'password');
test_conn($node, 'user=md5_role',
'password, non matching regexp for username',
2, log_unlike => [qr/connection authenticated:/]);
# Test with a comma in the regular expression. In this case, the use of
# double quotes is mandatory so as this is not considered as two elements
# of the user name list when parsing pg_hba.conf.
reset_pg_hba($node, 'all', '"/^.*5,.*e$"', 'password');
test_conn($node, 'user=md5,role', 'password, matching regexp for username',
0);
# Testing with regular expression for dbname. The third regex matches.
reset_pg_hba($node, '/^.*nomatch.*$, baddb, /^regex_t.*b$', 'all',
'password');
test_conn(
$node, 'user=md5_role dbname=regex_testdb',
'password, matching regexp for dbname', 0);
# The third regexp does not match anymore.
reset_pg_hba($node, '/^.*nomatch.*$, baddb, /^regex_t.*ba$',
'all', 'password');
test_conn(
$node,
'user=md5_role dbname=regex_testdb',
'password, non matching regexp for dbname',
2, log_unlike => [qr/connection authenticated:/]);
unlink($pgpassfile);
delete $ENV{"PGPASSFILE"};