Ticket #357 (new cage)

Opened 13 years ago

Last modified 11 years ago

Enable meaningful testing of t/native_pbc/*.t

Reported by: doughera Owned by:
Priority: normal Milestone:
Component: testing Version:
Severity: medium Keywords: 64bit needreview testing
Cc: Language:
Patch status: applied Platform:

Description

Version 0.9.1 was released with failing t/native_pbc/*.t tests. If the tests are to be included in 1.0, it would be nice to have confidence that they will pass in the released version.

Offhand, I can think of three big problems that contributed to the failure. I hasten to add that *all* of these were out of the control of the release manager -- they are inherent in the design of the tests.

1. The tests act differently in "DEVELOPING" versus released directories, meaning all the tests done in an svn-checkout -- even those done just prior to release -- weren't necessarily relevant.

2. The release procedure is difficult in practice. It is not easy, on short notice, to get developers with all the requisite different platforms to run the appropriate scripts, commit the appropriate files, and re-run all the appropriate tests. Also, because of item 1, even running all the appropriate tests does not ensure that the tests will actually pass in the released version.

3. There was no meaningful code freeze prior to the release, making it essentially impossible to do the coordination necessary for item 2.

What to do?

First, if the tests are to be kept, they must be designed so that they can be meaningfully run well in advance of the release. That way, new raw pbc files can be generated ahead of time, committed, and tested, and failures can be addressed. If they cannot be redesigned this way, then I think they are too fragile to waste time on, and they should simply be deleted.

Second, I think the ideas of having a code freeze and issuing release candidates merit serious consideration prior to 1.0.

Attachments

tt357-better_native_pbc_t.patch Download (27.3 KB) - added by rurban 13 years ago.
now independent of TT#254 64bit fixes

Change History

in reply to: ↑ description   Changed 13 years ago by jkeenan

Replying to doughera:

I strongly agree with Andy Dougherty's comments. My impression is that these tests don't stay fixed once we've fixed them, and they don't stay TODO-ed when we TODO them. Now, I know that's not literally true, but if you don't stay on #parrot all day long, that's the impression you might get.

Version 0.9.1 was released with failing t/native_pbc/*.t tests. If the tests are to

[snip]

1. The tests act differently in "DEVELOPING" versus released directories, meaning all the tests done in an svn-checkout -- even those done just prior to release -- weren't necessarily relevant.

We've long had the ability to write tests such that they respond one way if we're DEVELOPING and another way if we're not. Parrot::Revision works in this way, so I wrote tests which take that into account.

t/configure/018-revision_to_cache.t
t/configure/061-revision_from_cache.t
t/configure/017-revision_from_cache.t

I suspect that the t/native_pbc/*.t tests could be adapted in the same way.

2. The release procedure is difficult in practice. It is not easy, on short notice, to get developers with all the requisite different platforms to run the appropriate scripts, commit the appropriate files, and re-run all the appropriate tests. Also, because of item 1, even running all the appropriate tests does not ensure that the tests will actually pass in the released version.

Agreed.

[snip] What to do? First, if the tests are to be kept, they must be designed so that they can be meaningfully run well in advance of the release.

See above for part of the solution.

Thank you very much.
kid51

  Changed 13 years ago by doughera

  • milestone 1.0 deleted

(I removed the "Milestone 1.0" setting -- that should probably be reserved for much more important things.)

  Changed 13 years ago by coke

There is an open question as to whether cross-platform bytecode is a must-have for 1.0. While this is just for the testing of that feature, we can't do the test properly until the feature is working...

(However, given the time remaining until 1.0, I'm assuming that we're not going to get this feature set before then.)

  Changed 13 years ago by rurban

tools/dev/pbc_header.pl broke the test files.

There should have been enough time to get current darwin/ppc _3.pbc files.

The procedure described in the release_manager guide explicitly says:

Please check with C<prove t/native_pbc/*.t>

Maybe it needs to be an easier to follow description how to update those pbc's. But tools/dev/mk_native_pbc is much easier now, than before.

$ svn revert t/native_pbc/integer_1.pbc
Reverted 't/native_pbc/integer_1.pbc'

rurban@reini /usr/src/perl/parrot/parrot-svn
$ perl tools/dev/pbc_header.pl t/native_pbc/integer_1.pbc
t/native_pbc/integer_1.pbc
        wordsize      =   4
        byteorder     =   0
        floattype     =   0
        parrot_major  =   0
        parrot_minor  =   9
        parrot_patch  =   1
        bc_major      =   3
        bc_minor      =  34
        uuid_type     =   0
        uuid_size     =   0
        pad           =  14
        dir_format    = 1, 0, 0, 0

rurban@reini /usr/src/perl/parrot/parrot-svn
$ perl tools/dev/pbc_header.pl --upd t/native_pbc/integer_1.pbc

rurban@reini /usr/src/perl/parrot/parrot-svn
$ perl tools/dev/pbc_header.pl t/native_pbc/integer_1.pbc
t/native_pbc/integer_1.pbc
        wordsize      = 108
        byteorder     =  38
        floattype     = 116
        parrot_major  = 109
        parrot_minor  = -51
        parrot_patch  = 116
        bc_major      = -34
        bc_minor      =  90
        uuid_type     =   0
        uuid_size     =   0
        pad           =  14
        dir_format    = 1, 0, 0, 0

  Changed 13 years ago by rurban

  • status changed from new to assigned
  • owner set to rurban

I've fixed the broken tools/dev/pbc_header.pl --upd with r36890. The writer was completely rewritten.

I've also added the long planned UUID fingerprint feature, so that PBC changes are also detected during devel cycles, once the PBC_COMPAT file is updated.

So the tests can be relaxed if those TODOs are too annoying.

The library packfile reader does not yet check this UUID though.

  Changed 13 years ago by rurban

  • milestone set to 1.0

Enabled Milestone 1.0 again, as it is rather important that we get no passing TODO tests on the 1.0 release.

And the fix is really simple.

  Changed 13 years ago by rurban

  • patch set to new

Attached is the patch to check the bytecode version for the native pbcs, and to skip if failing.

No todo, as the pbc's are now portable with the patch in TT #254.

  Changed 13 years ago by rurban

Rewrite number.t with new attachment. Added proper coverage testmatrix for number tests, for the various converters.

  Changed 13 years ago by rurban

  • keywords need review added
  • type changed from todo to cage

Changed 13 years ago by rurban

now independent of TT#254 64bit fixes

  Changed 13 years ago by rurban

  • patch changed from new to applied

Applied as r37045.

t/native_pbc/*.pbc need to be regenerated after the PMC deprecations are finished.

  Changed 13 years ago by rurban

  • keywords 64bit needreview added; need review removed

  Changed 13 years ago by rurban

  • status changed from assigned to closed
  • resolution set to fixed

Fixed with r37077 and status update with r37077. This is now meaningful, agreed?

  Changed 13 years ago by doughera

  • status changed from closed to reopened
  • resolution fixed deleted

I have reopened this ticket because you have not addressed the main problem: The tests done during normal development testing are not the same as those done on a release. To be specific: Step 2g of the release manager's guide instructs the release manager to change all the *.pbc files. Thus the files in the release will not be the ones that were tested.

  Changed 13 years ago by allison

Unfortunately, these tests are not stable or robust enough yet. I'm returning all the t/native_pbc/* tests to their previous skipped state. They will remain skipped until we come up with a more robust testing strategy for bytecode portability.

I'll also remove the instructions from the release manager's guide. The release manager's guide represents the group's decisions on a stable, repeatable process for shipping releases. A change that affects the stability of releases must be reviewed by the group.

  Changed 13 years ago by allison

  • milestone 1.0 deleted

Tests skipped in r37254 and release manager guide modified in r37255.

  Changed 13 years ago by rurban

Allison, please provide arguments about the claim that these tests are not stable or robust enough yet. They are young, but all known tests pass, and some tests on unknown platforms are todo.

Your change did not changed it back to the previous skipped state, header was always passing, and the skip message says now "pending robust testing strategy, TT #357"

How should a robust test strategy look like? Get 100% coverage first? Good. Lets skip all todo tests then for a major release (? => 0)

But disabling these tests will not lead to more robust tests, it will lead to less coverage, less robust tests, with the same bugs in core. I've added the previous state commented out, so that at least willing parties can help in adding test coverage.

Good that the release_manager guide was changed, because the wrong tool broke the tests in 0.9.1 and it is not needed anyway.

  Changed 13 years ago by allison

All the native_pbc tests were skipped until you unskipped them 6 months ago. They will all remain skipped.

The tests that we run on the repository in the weeks (and months) before the release must give an accurate measure of whether the tests will pass in the released tarball. In the current process, platform testing before the release is rendered meaningless by fundamental changes to what's tested, made just before the tarball is created. That's simply not acceptable.

  Changed 13 years ago by doughera

There was discussion on the parrot-dev list about this, expanding on the point I repeatedly tried to make above. Here's one specific reference:  http://lists.parrot.org/pipermail/parrot-dev/2009-March/001727.html. This was the state as of 4 days ago, a mere 8 days before the 1.0 release.

From your reply at  http://lists.parrot.org/pipermail/parrot-dev/2009-March/001730.html, it is not clear to me whether or not the tests have been redesigned. It was clear to me that although you had marked the problem as "fixed", the main issue was most definitely not fixed.

It may be that the tests have been redesigned now. I don't know. It is too late for this release, in my opinion. The t/native_pbc/*.pbc files keep changing. There isn't time prior to release to give them wide testing, fix any issues, and give those fixes wide testing.

  Changed 12 years ago by coke

  • owner rurban deleted
  • status changed from reopened to new

  Changed 11 years ago by tcurtis

  • keywords testing added

  Changed 11 years ago by jkeenan

  • component changed from none to testing
Note: See TracTickets for help on using tickets.