Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure upstream repo is configured or absent depending on use_upstream_repo #123

Merged
merged 2 commits into from
Sep 23, 2016

Conversation

vutny
Copy link
Contributor

@vutny vutny commented Sep 20, 2016

Hi @gravyboat

The main feature of this PR is to make sure that the postgres.upstream state removes Postgres repository configuration and GnuPG key if the postgres:use_upstream_repo Pillar set to False.
Now such behavior is documented in README.

This PR also resolves issue #58 one more time, there were regressions in previous changes.

Additional improvements:

  • Make the postgres.upstream state more reusable, other states could safely use sls requisite.
  • On Debian/Ubuntu, fetch GPG key from PostgreSQL website, because keyserver port is often blocked by firewalls
  • Display message about repo unavailability for systems other than RHEL and Debian based.

@vutny
Copy link
Contributor Author

vutny commented Sep 21, 2016

Does someone have any comments, objections, test failures? ;)

@vutny vutny force-pushed the remove-upstream-repo branch from 855e31b to 2ed71f6 Compare September 21, 2016 09:08

{% if postgres.use_upstream_repo %}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you remove the conditional include? If I set postgres.use_upstream_repo = False, makes no sense to me to force a dependency on a state that sets and manages something I explicitely said don't want to use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, my PR is all about to make upstream.sls stateful.

And it makes sense when you remove upstream repository configuration.

{% endif %}
- pkgs: {{ pkgs }}
- refresh: {{ 'pkg_repo' in postgres }}
- require:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as in my previous comment, this requirement should either be conditional or, better yet, a require_in set in the install-postgresql-repo in the postgres/upstream.sls file, don't you agree?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Include and requirement statements are enforcing upstream.sls to clean up repository configuration before doing any further installation.


{{ codename|default(name, true) }}:
# PostgreSQL packages are mostly downloaded from `main` repo component
pkg_repo: 'deb http://apt.postgresql.org/pub/repos/apt/ {{ name }}-pgdg main {{ version }}'
pkg_repo_humanname: PostgreSQL Official Repository
pkg_repo: 'deb http://apt.postgresql.org/pub/repos/apt {{ name }}-pgdg main {{ upstream_version }}'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave {{ version }} here, as it's clearly being set in the preceeding lines you added, and is more consistent.

Copy link
Contributor Author

@vutny vutny Sep 21, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because pkg_repo parameter is only applicable for upstream repository, there is no point to set it's component name to the version provided by OS distribution.
But the main point here is that this variable also playing its part in removing upstream repository configuration when use_upstream_repo set to False.

Basically, upstream.sls will make sure that upstream deb source record and GPG key are removed. Because package names from upstream could possibly match the ones from the "stock", it's really important to avoid installation of newer tools or libs which would be incompatible with PostgreSQL server installed from distribution archive.

@javierbertoli
Copy link
Member

Sorry, busy days :)

All in all, I agree with the idea of this refactoring, but don't like the forced dependency that you now set to the upstream.sls file. And in that file, I think that enclosing all the code in a conditional if use_upstream_repo is error prone and confusing:

I expect that, if I explicitely include the upstream.sls state in my top.sls file, it will just work and configure the repo.

Now, with the conditional surrounding all the code, I have to both include the sls file and set use_upstream_repo to true.

I personally prefer (and consider it clearer) if you just let upstream.sls do its work, and then include it conditionally where needed (as it was being done before).

If this behaviour has any errors as mentioned in #58, just fix that?

@vutny
Copy link
Contributor Author

vutny commented Sep 21, 2016

@javierbertoli Thanks for your valuable feedback!

Indeed, I tried to make upstream.sls more reusable for the other states which install the packages and at first iteration I've removed direct requisites. But later, looking at the git repo history I found that previously it was designed that package installation states should require repo configuration state if use_upstream_repo set to True.

That seems good and reasonable, but again, require_in statement for the postgresql-repo could only reference a single pkg state, which was postgresql-installed previously:

install-postgresql-repo:
...
  pkgrepo.managed:
    - name: {{ postgres.pkg_repo }}
    - humanname: {{ postgres.pkg_repo_humanname }}
    - baseurl: {{ postgres.pkg_repo_url }}
    - gpgcheck: 1
    - gpgkey: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
     - require:
      - file: install-postgresql-repo
    - require_in:
      - pkg: postgresql-installed

It was added to ensure that packages are not going to be installed if repo configuration has failed.

Obviously, this case doesn't work for other states such as client.sls if you would try to execute it standalone. Requisite for postgresql-installed would not be found. But if I change pkg state ID in client.sls to the same name, there would be conflict if init.sls and client.sls become parts of single highstate.

That's why I had to redesign this requisite logic.

I've moved the conditional check for the postgres.use_upstream_repo variable to the upstream.sls file, and included it unconditionally in init.sls and client.sls with adding a whole SLS requirement. By doing that I've escaped reverse hard dependency on any particular state in the upstream.sls, no matter what will be rendered inside it. It just need to return True.

I believe this should be much more clear and reliable. As opposite, by having if statements around include and require, the same repeated logic spreads across SLS files to wrap each pkg state. That doesn't look good and hard to observe and maintain.

Since the upstream.sls itself starts to honor the use_upstream_repo setting it finally becomes as a separate reusable state! And it could be included in top.sls.

Even more, if you have use_upstream_repo set to False, any packages would be installed only after the upstream.sls would make sure there is no upstream repo configured and local metadata was updated on supported systems, because this is your real desire, right? ;-)

If you have any additional questions, I would be happy to answer them.

P.S. I will respond to some of you in-line comments to express more details. Thanks.

@javierbertoli
Copy link
Member

That seems good and reasonable, but again, require_in statement for the postgresql-repo could only > reference a single pkg state

As I understand it, require_in can reference multiple states, so this

install-postgresql-repo:
  file.managed:
    - name: /etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
    - source: https://download.postgresql.org/pub/repos/yum/RPM-GPG-KEY-PGDG
    - source_hash: md5=78b5db170d33f80ad5a47863a7476b22
  pkgrepo.managed:
    - name: {{ postgres.pkg_repo }}
    - humanname: {{ postgres.pkg_repo_humanname }}
    - baseurl: {{ postgres.pkg_repo_url }}
    - gpgcheck: 1
    - gpgkey: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
    - require:
      - file: install-postgresql-repo
    - require_in:
      - pkg: postgresql-installed
      - pkg: install-postgresql-client
      - pkg: install-postgres-libpq-dev

would be perfectly possible, keep upstream.sls independently usable and force the dependency when included. If that's not correct, can you point me to the documentation where it says that you can require_in just a single pkg state?

@vutny
Copy link
Contributor Author

vutny commented Sep 21, 2016

@javierbertoli No-no! Of course you can put as many requisites to the state as you want.

I've meant that such trick doesn't resolve the current issue with missing requirements for states which not included.

If I will try to apply only states from client.sls (I don't need server installed on some Minion), I will get SLS rendering error in this case:

[ERROR   ] Data passed to highstate outputter is not a valid highstate return:
{'local': ["Cannot extend ID 'postgresql-installed' in 'base:postgres.upstream'. It is not part of the high state.\n
This is likely due to a missing include statement or an incorrectly typed ID.\n
Ensure that a state with an ID of 'postgresql-installed' is available\n
in environment 'base' and to SLS 'postgres.upstream'"]}

The similar message I will have when try to apply just init.sls (salt-call state.apply postgres), because client states are missing.

This makes the states very much coupled with one another. That's why require_in is not an option here.

@vutny vutny force-pushed the remove-upstream-repo branch from 2ed71f6 to 771836b Compare September 22, 2016 08:51
@vutny
Copy link
Contributor Author

vutny commented Sep 22, 2016

@javierbertoli I understood my mistake. I've tried to kill two birds with one stone and have fallen into over-engineering... Thanks for your help to get it! :)

@gravyboat
I've rebased my changes on #124, which simply fixes issue #58 in simple and clear way.

Now my PR only does one thing -- it makes upstream.sls stateful.
If you apply the state explicitly (or include it in top.sls), the upstream repo will be removed or configured depending on postgres.use_upstream_repo Pillar setting. That's all about it.

Could you please review the changes? Thanks!

@javierbertoli
Copy link
Member

Cool! (Thanks for #124 btw 😄). So now, upstream.sls is fully independent and just manages the repo. If I include it, I'll manage the repo configuration.

Then, how about this:

  1. Default use_upstream_repo to True. Reasons:

    a. Unless I explicitly use this state file, use_upstream_repo will be ignored, even if set.
    b. If included explicitly, I'd expect it to configure and manage the repo. And then, and only then:
    c. If included and explicitly setting use_upstream_repo to False I'd expect it to remove the repo.

  2. In upstream.sls, I'd reorder code a little, evaluating the conditional for true first. I think it makes things clearer (and read somewhere, sometime, that evaluating conditionals for true values first is a good coding practice 😄):

{%- if grains['os_family']  in ('RedHat', 'Debian') %}
  # We manage the repo for only these distros
 {%- if postgres.use_upstream_repo -%}
  # If true, we add it
    {%- if grains['os_family'] == 'Debian' %}
    # Debian specific stuff
    {%- elif grains['os_family'] == 'RedHat' %}
    # RedHat specific stuff
    {% endif %}
    # Common things to do when adding the repo
  {% else %}
    # We remove the repo
  {% endif %}
{% else %}
# We notify that we don't manage this distro
test.show_notification:
  - text: |
    PostgreSQL does not provide package repository for {{ grains['os_family'] }}
{% endif %}

What do you think?

@vutny
Copy link
Contributor Author

vutny commented Sep 22, 2016

@javierbertoli Awesome!

  1. 👍 As an operator I'd expect the formula to install latest stable version of the software which is available (better from official source), and to be able to customize my needs using Pillar. Great idea!

  2. Your structure of conditionals looks pretty and much more readable. I like it!
    Just one thing, instead of checking for the supported system in tuple:

    {%- if grains['os_family']  in ('RedHat', 'Debian') %}

    I'd rather rely on postgres.pkg_repo variable. If it is defined in *map.yaml files for a particular platform, then we are good to go further:

    {%- if 'pkg_repo' in postgres %}

    Other systems could be added in the future (Solaris?) and I think it's better to hide logic deciding which distros could be managed in the lookup maps. For example, PostgreSQL repo doesn't have builds specific to Ubuntu 15.04 (Vivid Vervet). But the state will add it anyway, because it's Debian family. And that should be fixed...

I will adjust my code according to your suggestions. And your review was terrific! 😄 Thank you!

Any additional thoughts?

@vutny vutny force-pushed the remove-upstream-repo branch from 771836b to b4edec9 Compare September 22, 2016 14:05
@javierbertoli
Copy link
Member

@vutny, I agree with your suggestions. Perhaps, to cover the cases you mention (about Ubuntu/Debian and similar that might arise), we'd rather use os instead of os_family here, so we are more specific? Like:

{%- if 'pkg_repo' in postgres %}
  # We manage the repo for only these distros
 {%- if postgres.use_upstream_repo -%}
  # If true, we add it
    {%- if grains['os'] in ('Debian', 'some_debian_derivative', 'some_other') %}
    # Debian and derivatives that share stuff
    {%- elif grains['os'] in ('RedHat', 'CentOS') %}
    # RedHat and CentOS specific stuff
    {% endif %}
...
...    

@vutny
Copy link
Contributor Author

vutny commented Sep 23, 2016

@javierbertoli I would like to completely get rid of OS selection logic in the SLS file.
We have osmap.yaml flle which simply takes care about the platform specific things. And it could be easily converted to filter by the os Grain instead of the os_family if there will be too much difference between derivatives.

I've changed and pushed the code which abstracts pkg.managed state arguments for supported OSes.
Please could you be so kind to take a look and tell me is that OK?

In the meanwhile I'll try to set default use_upstream_repo = True in a single place...

@javierbertoli
Copy link
Member

Mmmm... I don't see it. maybe you forgot to git push? ("Oh, Fridays!" 😄)

@vutny
Copy link
Contributor Author

vutny commented Sep 23, 2016

@javierbertoli Actually I've pushed it yesterday 😄 by just amending my commit b4edec9. Please try to refresh the page or do git clone again.

@vutny
Copy link
Contributor Author

vutny commented Sep 23, 2016

Allright! I've set use_upstream_repo: True by default with separate commit.
@javierbertoli Waiting for your kind approval. 🙏

@javierbertoli javierbertoli merged commit ea8c88c into saltstack-formulas:master Sep 23, 2016
@javierbertoli
Copy link
Member

Merged it. Great work, @vutny ! :)

Going through this iteration, made me realize that the '*yaml' files are carrying too much logic for files that must let you, at a quick glance, understand what values are being set as default or overriden in each of those files. I'm adding an issue (#125) to address this, with more details.

@vutny vutny deleted the remove-upstream-repo branch September 26, 2016 10:59
@vutny
Copy link
Contributor Author

vutny commented Sep 26, 2016

Thank you @javierbertoli !

Yep, I agree, it appears too much Jinja code in the lookup maps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants