Page MenuHomeFreeBSD

ping6: separate command line tokens parsing from processing of the option arguments
Needs ReviewPublic

Authored by sucanjan_gmail.com on Aug 27 2019, 12:48 PM.

Details

Reviewers
asomers
Summary

Separate command line tokens parsing from processing of the option arguments

With this it's easier to add an alternative option set getopt loop because the code for processing the arguments isn't contained in the loop so it's not duplicated.

Diff Detail

Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 26256
Build 24746: arc lint + arc unit

Event Timeline

I would also like to ask for help with copyright. What should I use if some code is not mine (mainly the #define directives) and some is?

I would also like to ask for help with copyright. What should I use if some code is not mine (mainly the #define directives) and some is?

If you're moving it from another file, then move or copy that file's copyright header. If you're also substantially adding to it, you can add another copyright in your name.

sbin/ping6/options.h
2

Don't forget the copyright header.

asomers added inline comments.Aug 27 2019, 3:50 PM
sbin/ping6/options.c
158

Do you really save much code by handling optarg down here? This obvious method, handling optarg in each case statement, might be more verbose, but I think it would make the code in ping6.c clearer. It would allow the struct options_processed to have a more obvious and self-documenting structure, using appropriate data types for each option.

Add copyright information to options.c and options.h.
Use appropriate data types for each option of struct options_processed.

sbin/ping6/options.c
158

Done.

asomers added inline comments.Aug 27 2019, 9:12 PM
sbin/ping6/options.c
158

Ultimately there will be two different consumers of options_parse, right? If so, then there's going to be duplicate code for stuff like string->int conversion. I think it would be a good idea to move that stuff into options_parse.

sbin/ping6/options.c
158

I agree (I suppose that the two consumers are ping and ping6). I would like to continue sending the changes from the unified option parsing in my branch at GitHub for a review. It contains code for conversion of the arguments. Could it be done in this way?

asomers added inline comments.Aug 29 2019, 2:44 PM
sbin/ping6/options.c
158

I don't understand. Could what be done in what way?

sbin/ping6/options.c
158

To me it seemed that you would like this diff to contain the string->int conversion. I think it would be better to implement it in a separate diff (the changes are ready in my branch on GitHub) because this one is a dependency the option compatibility getopt loop which will not cause duplication of the conversion code.

asomers added inline comments.Aug 29 2019, 3:49 PM
sbin/ping6/options.c
158

Are you trying to see that the natural order for patches would be:

  1. This one
  2. Backwards-compatible ping6 options
  3. consolidate string->int conversion here?
sbin/ping6/options.c
158

Something like that.

  1. This diff.
  2. Merge ping and ping6. Because if we applied the patch adding the backward-compatibility here, it would take precedence over the new option names (because the name of the executable would still be ping6) and it could break the scripts that already migrated to the new options.
  3. Backwards-compatible ping6 options.
  4. From here I will start unifying processing of the options one by one so it's easy to review the changes. This will involve use of the struct options like it is used in my GSoC branch.

I have a question concerning step 4. This diff uses struct options_processed. Maybe more appropriate name would be 'struct options_to_process' or something like that. This structure will be private for the option parsing code. It is used to carry information between the two stages of option parsing: detecting options with their arguments, and processing those arguments. The struct options will be used for holding the parsed arguments and will be use outside the option parsing code, in ping and ping6 code. I will be glad to use some better naming, if you give me some suggestions. :-)

asomers added inline comments.Aug 29 2019, 7:10 PM
sbin/ping6/options.c
158

Regarding step 4, I still don't see a compelling reason to separate string->integer conversion from getopt. I would just combine them, and call the structure struct options.

sbin/ping6/options.c
158

Should checking for conversion errors be a part of the getopt loop too?

asomers added inline comments.Aug 31 2019, 4:29 AM
sbin/ping6/options.c
158

You mean like the kind of error you would get from ping -c not_an_integer? Yes, I think so.

sbin/ping6/options.c
158

OK. Syntactic checks will be in getopt loop, semantic checks will remain in main().

Sort SRCS in the Makefile alphabetically.
Move string-to-number conversion for -b option to options.c.

sbin/ping6/options.c
158

I did it only for -b option as a preview. Could you please check it?

Yes, that's about what I had in mind. But did you intend to set f_sock_buff_size in the getopt loop?

Yes, it's intended. When a pointer to an argument was an output from the getopt loop, it contained information about the value and whether the user provided the argument (there were basically two variables: the pointer itself and data which it was pointing to). When a number is the output, the information about providing an argument is contained in the boolean flag. Another solution would be to restrict range of values returned by strtoul (or use a wider data type) so it's possible to differentiate between return value of strtoul and a value (outside that range) to which the variable is initialized by default.

Rename struct options_processed to struct options.
Move all strto* argument conversion code to options.c.

asomers added inline comments.Sep 3 2019, 3:22 PM
sbin/ping6/options.c
164

Here's another wart. ping allows the -c argument to be in any base, but ping6 forces base 10. It would be good to fix that in a separate commit.

sbin/ping6/ping6.c
365

This check is redundant with the one in options.c

Remove semantic check of the -c option from options.c.

sbin/ping6/options.c
164

I will fix it in the next diff.

sbin/ping6/ping6.c
365

Thanks. Fixed.

So what's the plan for ping -6 vs ping? Are you going to have two different copies of options_parse?

Well, I'm not sure yet. I see two solutions:

  • to have two different copies of options_parse. A drawback is that even those parts of the getopt loop that don't need to be duplicated would be duplicated.
  • to have another getoptloop only for translating/replacing option names. This loop would be called first when the program is executed in compatibility mode. Then, the options would be processed by the main getop loop. This solution is possible because the sets of options have the same size and there is only renaming of the options.

However, there is a small issue that affects both of the solutions. Some of the error messages include option names. An error message containing incorrect option name could be produced. In the first solution it would be a semantic error message in ping6.c (both of the getopt loops would have the information about the actual names so they would produce correct error messages). In the second solution an incorrect error message would be produced even earlier; in the main getopt loop.
Solutions to this issue:

  • to save information about the actual name of an option and use it as a parameter of the error message.
  • to modify the error messages not to include option names.

From a user point of view, I would vote for the first solution. Having an option name included in an error message is a great help when identifying what went wrong.

I would like to know your opinion.

Since only a few options have changed, what about defining constants like #DEFINE ADDRTYPE_OPT 'k' before including options.h, and composing the getopt string like we already do with ADDOPTS. Then in options.c, for those options whose flags differ between the two programs, compose the error message at runtime.

I need to be sure I understand you correctly.

Since only a few options have changed, what about defining constants like #DEFINE ADDRTYPE_OPT 'k' before including options.h, and composing the getopt string like we already do with ADDOPTS.

Does it mean that there could be only one getopt loop with a variable getopt string for solving the compatibility problem?

Then in options.c, for those options whose flags differ between the two programs, compose the error message at runtime.

In order to construct the message at runtime information about which name for the option was used on the command line will be needed. The name could be saved in a char variable and used in the error message.

I need to be sure I understand you correctly.

Since only a few options have changed, what about defining constants like #DEFINE ADDRTYPE_OPT 'k' before including options.h, and composing the getopt string like we already do with ADDOPTS.

Does it mean that there could be only one getopt loop with a variable getopt string for solving the compatibility problem?

Yes, exactly.

Then in options.c, for those options whose flags differ between the two programs, compose the error message at runtime.

In order to construct the message at runtime information about which name for the option was used on the command line will be needed. The name could be saved in a char variable and used in the error message.

It's already saved in the variable ch.

Does it mean that there could be only one getopt loop with a variable getopt string for solving the compatibility problem?

Yes, exactly.

In this case there would have to be some syntactic check duplication in the getopt's switch-case. Some options were renamed according to the pattern (A ->B, B->C) so one case would contain syntactic check for option A and the other for both B and A. If only the case labels could have been non-constant.

In order to construct the message at runtime information about which name for the option was used on the command line will be needed. The name could be saved in a char variable and used in the error message.

It's already saved in the variable ch.

Yes, I forgot that. However, this information would be only for currently processed option. It would be needed also for semantic check error messages in ping6.c after the while(getopt()) loop would have been exited.

I will try to think about how to solve these two issues. Maybe results from this discussion could also be beneficial to some other utilities (e.g., merging traceroute6 to traceroute).

Does it mean that there could be only one getopt loop with a variable getopt string for solving the compatibility problem?

Yes, exactly.

In this case there would have to be some syntactic check duplication in the getopt's switch-case. Some options were renamed according to the pattern (A ->B, B->C) so one case would contain syntactic check for option A and the other for both B and A. If only the case labels could have been non-constant.

True, the case labels must be compile-time constants, but they needn't be literals. If you #define ADDRTYPE_OPT 'k' somewhere, then you can case ADDRTYPE_OPT.

The single getopt loop would have to be parametrized at runtime. How do the preprocessor directives fit into this?

The single getopt loop would have to be parametrized at runtime. How do the preprocessor directives fit into this?

You can move all of the getopt stuff into options.h, and then define the variable flags _before_ including options.h, like this:

ping6.c

#define ADDRTYPE_FLAG 'a'
#include "options.h"

ping.c

#define ADDRTYPE_FLAG 'k'
#include "options.h"

Aha, to me it's similar to C++ templates. So there would be only one getopt loop at a source level, and two getopt loops at an object level. The two getopt loops (for the ping6's new option set and for the old one) would be called from a single source file like this:

if (strcmp(argv[0], "ping6") == 0)
    parse_old_options();
else
    parse_new_options();

so the name of the parse function would also need to be parametrized to avoid conflicts of symbols at the object level.

Aha, to me it's similar to C++ templates. So there would be only one getopt loop at a source level, and two getopt loops at an object level. The two getopt loops (for the ping6's new option set and for the old one) would be called from a single source file like this:

if (strcmp(argv[0], "ping6") == 0)
    parse_old_options();
else
    parse_new_options();

so the name of the parse function would also need to be parametrized to avoid conflicts of symbols at the object level.

Would it be easier if you used an actual C++ template? That's an option. Utilities in the base system are allowed to use C++98.

I would like to stick with C, but after some basic experiments it seems that it's not possible to use a C preprocessor to parametrize function names. I'm going to do some research on this.

The parametrization using macros is possible. However, it probably means having the loop defined in a header file and not using an include guard in it so it can be included twice in a single source file. The single parametrized loop would contain cases for options of the two loops (new and compat). If some new options are added in the future, it would be more clear to a reader to have two separate loops at a source level.

Should I use the parametrization? I'm asking just to be completely sure that it's the way to go and to minimize the diff rewriting.